Subjectivity datasets of Movie Review Data
- Last Update: November 24, 2015 Created: November 24, 2015
Public
Profile
Title of the dataset | Subjectivity datasets of Movie Review Data |
---|---|
Provenance of the dataset | http://www.cs.cornell.edu/people/pabo/movie-review-data/ |
How were the data collected/created? What was the cost? | The starting points for data acquisition were snippets of movie reviews from Rotten Tomatoes (http://www.rottentomatoes.com/) and plot summaries for movies from the Internet Movie Database (http://www.imdb.com). The HTML files under the folder subj are web pages containing Rotten Tomatoes snippets; the HTML files under the folder obj are web pages containing IMDb plot summaries. |
Data sharing policy | Other |
About data analysis and simulation
Type of data: Check all that apply. Use "Other" to specify other types so that we can include them in further updates. | text |
---|---|
Variable labels of dataset (the names of the variables) | OBJECTIVE SENTENCES (TEXT), SUBJECTIVE SENTENCES (TEXT) |
Outline of data | This dataset is distributed as part of the movie-review data for use in sentiment-analysis experiments. It contains 5000 subjective and 5000 objective processed sentences and was introduced in Pang and Lee, ACL 2004 (released June 2004). The movie-review data as a whole comprises collections of movie-review documents labeled with respect to their overall sentiment polarity (positive or negative) or subjective rating (e.g., two and a half stars), and sentences labeled with respect to their subjectivity status (subjective or objective) or polarity. |
Simulation process | sentiment classification/summarization/sentiment categorization/sentiment polarity/rating evaluation |
Expected outcome of the process (obtained knowledge, analysis results, output of tools) | polarity of reviews/summarized reviews/clusters/categories |
Anticipation for analyses/simulations other than the typical ones provided above |
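
As a rough illustration of how the two variables above could be prepared for a subjectivity-classification experiment, the following sketch reads one sentence per line from a subjective file and an objective file and returns shuffled (sentence, label) pairs. The function names, the label strings, the one-sentence-per-line layout, and the Latin-1 encoding are assumptions made for this example, not details confirmed by this page; the paths are placeholders for wherever the files are stored.

```python
import random


def load_sentences(path, label, encoding="latin-1"):
    """Read one sentence per line and pair each non-empty line with a label.

    The encoding default is an assumption; adjust it to match the files.
    """
    examples = []
    with open(path, encoding=encoding) as handle:
        for line in handle:
            sentence = line.strip()
            if sentence:  # skip blank lines
                examples.append((sentence, label))
    return examples


def build_dataset(subjective_path, objective_path, seed=0):
    """Combine both files into one list of (sentence, label) pairs.

    The list is shuffled with a fixed seed so that repeated runs
    produce the same ordering.
    """
    examples = load_sentences(subjective_path, "subjective")
    examples += load_sentences(objective_path, "objective")
    random.Random(seed).shuffle(examples)
    return examples
```

Calling `build_dataset("path/to/subjective.txt", "path/to/objective.txt")` with the real file locations would give a single labeled list that can then be split into training and test portions.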
Other
Comments | Bo Pang, Lillian Lee, and Shivakumar Vaithyanathan, Thumbs up? Sentiment Classification using Machine Learning Techniques, Proceedings of EMNLP 2002. Bo Pang and Lillian Lee, A Sentimental Education: Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts, Proceedings of ACL 2004. Bo Pang and Lillian Lee, Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales, Proceedings of ACL 2005. |
---|---|
What kind of data/tools do you wish to have? | |
Visualized information | |
Sample data |