Vector representations of sentences

  • Last Update: October 31, 2017 Created: October 31, 2017

Public

Profile

Title of the dataset Vector representations of sentences
Provenance of the dataset Refining Raw Sentence Representations for Textual Entailment Recognition via Attention
How were the data collected/created? What was the cost? The data were created by training an end-to-end neural network classifier that takes a pair of sentences from the MultiNLI dataset as input and predicts which of three categories the pair belongs to. The cost is difficult to estimate, but studying the theory behind the implementation and writing the code took around 700 man-hours.
Data sharing policy Other

About data analysis and simulation

Type of data text, number
Variable labels of dataset (the names of the variables) SENTENCE_TYPE|REPRESENTATION_VECTOR|SENTENCE_PAIR_ID
Outline of data This dataset contains 300-dimensional sentence representations for the validation set of the MultiNLI dataset, first published in the RepEval 2017 shared task at the EMNLP 2017 conference.
Simulation process Similarity analysis using cosine distance
Expected outcome of the process (obtained knowledge, analysis results, output of tools) Understand how well the semantics and syntax of each sentence are encoded in the 300-dimensional space. See the shared task recap paper: https://arxiv.org/pdf/1707.08172.pdf
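The similarity analysis named above could be sketched roughly as follows. This is a minimal illustration, not the dataset author's code: it assumes each record has already been parsed into a (SENTENCE_TYPE, REPRESENTATION_VECTOR, SENTENCE_PAIR_ID) tuple, and that SENTENCE_TYPE takes the values "premise" and "hypothesis". Neither the file layout nor those label values are stated on this page, and the function names are hypothetical.

```python
import math
from collections import defaultdict


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


def pair_distances(rows):
    """Group rows by SENTENCE_PAIR_ID and compare the two sentences of each pair.

    rows: iterable of (sentence_type, vector, pair_id) tuples.
    The labels "premise" and "hypothesis" are assumed, not documented.
    Pairs that lack either sentence are skipped.
    """
    by_pair = defaultdict(dict)
    for sentence_type, vector, pair_id in rows:
        by_pair[pair_id][sentence_type] = vector
    return {
        pair_id: cosine_distance(s["premise"], s["hypothesis"])
        for pair_id, s in by_pair.items()
        if "premise" in s and "hypothesis" in s
    }


# Toy 3-dimensional vectors stand in for the real 300-dimensional ones.
example_rows = [
    ("premise", [1.0, 0.0, 0.0], "p1"),
    ("hypothesis", [1.0, 0.0, 0.0], "p1"),
    ("premise", [1.0, 0.0, 0.0], "p2"),
    ("hypothesis", [0.0, 1.0, 0.0], "p2"),
]
distances = pair_distances(example_rows)
```

A distance near 0 means the two sentence vectors point in almost the same direction, while a distance near 1 means they are close to orthogonal. Comparing these distances across the three label categories is one way to probe how much of each sentence's meaning the representation captures.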
Anticipation for analyses/simulations other than the typical ones provided above

Other

Comments
What kind of data/tools do you wish to have?
Visualized information
Sample data

