Opinion mining and sentiment analysis
Bo Pang and Lillian Lee
Foundations and Trends in Information Retrieval 2(1-2), pp. 1–135, 2008.
Also available as a book or e-book.
http://www.cs.cornell.edu/home/llee/opinion-mining-sentiment-analysis-survey.html
The monograph itself:
- published version
- authors-formatted version: differs slightly from the final print version in copy-editing and typesetting (the print version has one known introduced typo); has fewer pages than the published version due to tighter formatting; includes hyperref links for navigation and full first names in the bibliography.
- Now Publishers website: HTML version with hyperlinks to all references for series subscribers, and order forms for book or e-book versions (use promotion code INR002001 for a 50% discount).
Note also the description of the publisher’s philosophy and benefits of their approach. If you would like to support a publisher that allows authors to post full content for free on their homepage as a matter of course (and you believe that content is of value) please consider asking your institution to purchase or subscribe to that publisher’s products.
Please note that we authors do not receive any royalties from purchases.
The publishers offer special pricing if 10+ copies are required for use in a course or otherwise.
- Amazon site
Bibliography:
- bibtex file: original except for addition of the survey itself (336 references total)
- search ‘n sort interface
Associated slides:
- South by SouthWest (SXSW) Interactive 2011 talk slides: a search-oriented overview, with about half based on this monograph.
- AAAI 2008 invited talk slides, based for the most part on this monograph.
- ICWSM 2009 version (includes discussion of a WWW’09 paper)
Abstract:
An important part of our information-gathering behavior has always been to find out what other people think. With the growing availability and popularity of opinion-rich resources such as online review sites and personal blogs, new opportunities and challenges arise as people can, and do, actively use information technologies to seek out and understand the opinions of others. The sudden eruption of activity in the area of opinion mining and sentiment analysis, which deals with the computational treatment of opinion, sentiment, and subjectivity in text, has thus occurred at least in part as a direct response to the surge of interest in new systems that deal directly with opinions as a first-class object.
This survey covers techniques and approaches that promise to directly enable opinion-oriented information-seeking systems. Our focus is on methods that seek to address the new challenges raised by sentiment-aware applications, as compared to those that are already present in more traditional fact-based analysis. We include material on summarization of evaluative text and on broader issues regarding privacy, vulnerability to manipulation, and economic impact that the development of opinion-oriented information-access services gives rise to. To facilitate future work, a discussion of available resources, benchmark datasets, and evaluation campaigns is also provided.
Mentions (roughly chronological order):
“Congratulations”, Matthew Hurst | “THE survey to read”, “tremendous resource”, Jeffrey Carr | “Excellent”, George Tziralis | “excellent points”, Jessica Hullman | “more than a must”, José María Gómez Hidalgo | “excellent and very comprehensive”, Philip Resnik | “excellent and comprehensive survey”, Nikolay Archak, Anindya Ghose, and Panagiotis G. Ipeirotis | “a gold mine”, Jaylan Turkkan | “definitive monograph”, Seth Grimes, who also wrote a practitioners’-perspective mini-review | “entertaining … excellent and timely”, Shlomo Argamon, Computational Linguistics brief review | linked to under anchor text “science” of sentiment by Discover Magazine’s blog and named by an article on sentiment analysis in the New York Times.
Textbook for the following courses: Social Media Analysis, William Cohen, CMU, Spring 2010; Computational linguistics II: opinion mining and sentiment analysis, Hyopil Shin, Seoul National University, Spring 2009
Table of Contents:
- Introduction
- The demand for information on opinions and sentiment
- What might be involved? An example examination of the construction of an opinion/review search engine
- Our charge and approach
- Early history
- A note on terminology: Opinion mining, sentiment analysis, subjectivity, and all that
- Applications
- Applications to review-related websites
- Applications as a sub-component technology
- Applications in business and government intelligence
- Applications across different domains
- General Challenges
- Contrasts with standard fact-based textual analysis
- Factors that make opinion mining difficult
- Classification and Extraction
Part One: Fundamentals
- Problem formulations and key concepts
- Sentiment polarity and degrees of positivity
- Subjectivity detection and opinion identification
- Joint topic-sentiment analysis
- Viewpoints and perspectives
- Other non-factual information in text
- Features
- Term presence vs. frequency
- Term-based features beyond term unigrams
- Parts of speech
- Syntax
- Negation
- Topic-oriented features
Part Two: Approaches
- The impact of labeled data
- Domain adaptation and topic-sentiment interaction
- Domain considerations
- Topic (and sub-topic or feature) considerations
- Unsupervised approaches
- Unsupervised lexicon induction
- Other unsupervised approaches
- Classification based on relationship information
- Relationships between sentences and between documents
- Relationships between discourse participants
- Relationships between product features
- Relationships between classes
- Incorporating discourse structure
- Language models
- Special considerations for extraction
- Identifying product features and opinions in reviews
- Problems involving opinion holders
- Summarization
- Single-document opinion-oriented summarization
- Multi-document opinion-oriented summarization
- Some problem considerations
- Textual summaries
- Non-textual summaries
- Review(er) quality
- Broader Implications
- Economic impact of reviews
- Surveys summarizing relevant economic literature
- Economic-impact studies employing automated text analysis
- Interactions with word of mouth (WOM)
- Implications for manipulation
- Publicly Available Resources
- Datasets
- Acquiring labels for data
- An annotated list of datasets
- Evaluation campaigns
- TREC opinion-related competitions
- NTCIR opinion-related competitions
- Lexical resources
- Tutorials, bibliographies, and other references
- Concluding Remarks
- References
Lillian Lee’s home page | Lillian Lee’s co-authored papers on sentiment analysis