Sentiment scale datasets of Movie Review Data
- Last Update: November 24, 2015. Created: November 24, 2015
Public
Profile
Title of the dataset | Sentiment scale datasets of Movie Review Data |
---|---|
Provenance of the dataset | http://www.cs.cornell.edu/people/pabo/movie-review-data/ |
How were the data collected/created? What was the cost? | The authors automatically tokenized the reviews and applied pattern matching to remove explicit rating indications. Subjective sentences were identified automatically using the system described in their ACL 2004 paper (http://www.cs.cornell.edu/home/llee/papers/cutsent.home.html). |
Data sharing policy | Other |
About data analysis and simulation
Type of data: Check all that apply. Use "Other" to specify other types so that we can include them in further updates. | text number |
---|---|
Variable labels of dataset (the names of the variables) | AUTHOR NAME, THREE CLASS, NORMALIZED NUMERICAL RATING, FOUR CLASS, REVIEW TEXT (DOCUMENT), CLASS LABELS, ID OF THE SOURCE HTML FILE, NUMERICAL RATING |
Outline of data | The data are distributed as movie-review data for use in sentiment-analysis experiments: a collection of documents whose labels come from a rating scale, introduced in Pang and Lee, ACL 2005 (released July 2005). There are two files, roughly corresponding to (1) the reviews after pre-processing, including subjectivity extraction (the data the authors used in their experiments), and (2) the reviews after very light pre-processing. Available from the source are collections of movie-review documents labeled by overall sentiment polarity (positive or negative) or subjective rating (e.g., two and a half stars), and sentences labeled by subjectivity status (subjective or objective) or polarity. |
Simulation process | sentiment classification/summarization/sentiment categorization/sentiment polarity/rating evaluation |
Expected outcome of the process (obtained knowledge, analysis results, output of tools) | polarity of reviews/summarized reviews/clusters/categories |
Anticipation for analyses/simulations other than the typical ones provided above |
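
The variable labels listed above suggest one record per review. The following is a minimal Python sketch of how such records might be assembled, assuming each variable is available as a sequence of lines in which line i of every sequence describes the same review. That alignment is an assumption, not something stated on this page, and the names `Review` and `build_reviews` are hypothetical. Only a subset of the listed variables is modeled.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Review:
    source_id: str    # ID of the source HTML file
    author: str       # author name
    rating: float     # normalized numerical rating
    three_class: int  # three-class label
    four_class: int   # four-class label
    text: str         # review text (document)


def build_reviews(
    author: str,
    ids: Iterable[str],
    ratings: Iterable[str],
    three_class: Iterable[str],
    four_class: Iterable[str],
    texts: Iterable[str],
) -> List[Review]:
    """Combine line-aligned columns into one record per review.

    Assumes (hypothetically) that all columns have one entry per review
    and appear in the same order.
    """
    columns = [list(c) for c in (ids, ratings, three_class, four_class, texts)]
    if len({len(c) for c in columns}) != 1:
        raise ValueError("columns must all have the same number of lines")
    return [
        Review(i.strip(), author, float(r), int(c3), int(c4), t.strip())
        for i, r, c3, c4, t in zip(*columns)
    ]


# Made-up values for illustration only; they are not taken from the dataset.
sample = build_reviews(
    author="example_author",
    ids=["101", "102"],
    ratings=["0.8", "0.3"],
    three_class=["2", "0"],
    four_class=["3", "0"],
    texts=["a warm , funny film .", "dull and overlong ."],
)
print(sample[0])
```

Checking that all columns have the same length before zipping guards against silently pairing a rating with the wrong review if one input is truncated.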
Other
Comments | (1) Bo Pang, Lillian Lee, and Shivakumar Vaithyanathan. "Thumbs up? Sentiment Classification using Machine Learning Techniques." Proceedings of EMNLP, 2002. (2) Bo Pang and Lillian Lee. "A Sentimental Education: Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts." Proceedings of ACL, 2004. (3) Bo Pang and Lillian Lee. "Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales." Proceedings of ACL, 2005. |
---|---|
What kind of data/tools do you wish to have? | |
Visualized information | |
Sample data |