Twitter Sentiment Analysis: Lexicon Method, Machine Learning Method and Their Combination
Olga Kolchyna, Tharsis T. P. Souza, Philip Treleaven, Tomaso Aste

TL;DR
This paper compares lexicon-based and machine learning methods for Twitter sentiment analysis, demonstrating that combining these approaches and using cost-sensitive classifiers enhances classification accuracy.
Contribution
It introduces a new ensemble method combining lexicon scores with machine learning and evaluates performance improvements on a benchmark dataset.
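One way to picture combining lexicon scores with machine learning is to append the lexicon score to the feature vector a classifier would consume. The sketch below assumes that reading; this summary does not spell out the paper's actual ensemble construction. The toy lexicon, the `build_features` helper and the vocabulary are hypothetical names introduced only for illustration.

```python
from collections import Counter

# Hypothetical toy lexicon (assumption); the paper's lexicons are not reproduced here.
LEXICON = {"good": 1.0, "great": 2.0, "bad": -1.0, "awful": -2.0}


def lexicon_score(tokens):
    """Sum the polarity of every token that appears in the lexicon."""
    return sum(LEXICON.get(token, 0.0) for token in tokens)


def build_features(text, vocabulary):
    """Return bag-of-words counts over `vocabulary`, with the lexicon score appended last."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    bag_of_words = [float(counts[word]) for word in vocabulary]
    return bag_of_words + [lexicon_score(tokens)]


vocabulary = ["movie", "great", "awful"]
features = build_features("Great movie great cast", vocabulary)
print(features)
```

A vector built this way could then be passed to any standard classifier, such as the SVM listed under Methods, so the model can weigh the lexicon signal alongside the word counts.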
Findings
Machine learning methods outperform lexicon-based methods.
Enhancing lexicons with social media slang improves accuracy.
Combined ensemble method yields more precise sentiment classification.
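The effect of enhancing a lexicon can be illustrated with a minimal lexicon-based classifier. This is a sketch under stated assumptions: the word lists, polarity values and the `classify` function are invented for illustration and are not the lexicons or scoring rule evaluated in the paper.

```python
# Hypothetical base lexicon of ordinary words (assumption).
BASE_LEXICON = {"love": 1.0, "hate": -1.0, "happy": 1.0, "sad": -1.0}

# Hypothetical social-media additions: emoticons, abbreviations, slang (assumption).
SOCIAL_LEXICON = {":)": 1.0, ":(": -1.0, "lol": 0.5, "gr8": 1.0, "smh": -0.5}


def classify(text, lexicon):
    """Label text by the sign of its summed token polarities."""
    tokens = text.lower().split()
    score = sum(lexicon.get(token, 0.0) for token in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


enhanced_lexicon = {**BASE_LEXICON, **SOCIAL_LEXICON}
tweet = "that show was gr8 :)"

base_label = classify(tweet, BASE_LEXICON)
enhanced_label = classify(tweet, enhanced_lexicon)
print(base_label, enhanced_label)
```

With the base lexicon alone the tweet contains no known words and falls back to neutral; once the slang and emoticon entries are merged in, the same tweet is labeled positive.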
Abstract
This paper covers the two approaches for sentiment analysis: i) lexicon based method; ii) machine learning method. We describe several techniques to implement these approaches and discuss how they can be adopted for sentiment classification of Twitter messages. We present a comparative study of different lexicon combinations and show that enhancing sentiment lexicons with emoticons, abbreviations and social-media slang expressions increases the accuracy of lexicon-based classification for Twitter. We discuss the importance of feature generation and feature selection processes for machine learning sentiment classification. To quantify the performance of the main sentiment analysis methods over Twitter we run these algorithms on a benchmark Twitter dataset from the SemEval-2013 competition, task 2-B. The results show that machine learning method based on SVM and Naive Bayes classifiers…
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Advanced Text Analysis Techniques · Spam and Phishing Detection
Methods: Support Vector Machine
