Performance evaluation of Reddit Comments using Machine Learning and Natural Language Processing methods in Sentiment Analysis
Xiaoxia Zhang, Xiuyuan Qi, Zixin Teng

TL;DR
This study evaluates various machine learning and NLP models, including transformer-based ones, for sentiment analysis on Reddit comments using the GoEmotions dataset, highlighting RoBERTa's superior performance.
Contribution
It expands prior work by assessing a diverse set of models and evaluation criteria, including hierarchical classification and efficiency, on a large emotion-labeled Reddit comment dataset.
Findings
RoBERTa outperforms traditional classifiers in accuracy
Hierarchical classification improves emotion detection granularity
Computational efficiency varies across models
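To make the hierarchical finding concrete, here is a minimal sketch of scoring the same flat predictions at two levels of a label hierarchy: fine-grained emotions and coarse sentiment groups. It is not the authors' method. This summary does not say how the hierarchy is defined, so the grouping in `EMOTION_TO_SENTIMENT`, the helper names, and the sample labels are all assumptions made for illustration.

```python
# Assumed grouping for illustration only: a small subset of fine-grained
# emotion labels rolled up into coarse sentiment groups. The hierarchy
# actually used in the paper is not given in this summary.
EMOTION_TO_SENTIMENT = {
    "joy": "positive",
    "gratitude": "positive",
    "admiration": "positive",
    "anger": "negative",
    "sadness": "negative",
    "fear": "negative",
    "confusion": "ambiguous",
    "surprise": "ambiguous",
    "neutral": "neutral",
}


def to_coarse(labels):
    """Map each fine-grained emotion label to its coarse sentiment group."""
    return [EMOTION_TO_SENTIMENT[label] for label in labels]


def accuracy(y_true, y_pred):
    """Fraction of positions where the predicted label equals the gold label."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def hierarchical_report(y_true, y_pred):
    """Score the same predictions at the fine and the coarse level."""
    return {
        "fine_accuracy": accuracy(y_true, y_pred),
        "coarse_accuracy": accuracy(to_coarse(y_true), to_coarse(y_pred)),
    }


# Hypothetical gold labels and model predictions, invented for this example.
gold = ["joy", "anger", "sadness", "confusion", "neutral", "gratitude"]
pred = ["gratitude", "anger", "fear", "surprise", "neutral", "gratitude"]

report = hierarchical_report(gold, pred)
print(report)
```

In this toy data, a prediction such as `fear` for a `sadness` comment counts as an error at the fine level but as correct at the coarse level, so comparing the two scores shows how many of a model's mistakes stay within the right sentiment group.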
Abstract
Sentiment analysis, an increasingly vital field in both academia and industry, plays a pivotal role in machine learning applications, particularly on social media platforms like Reddit. However, the efficacy of sentiment analysis models is hindered by the lack of large, fine-grained emotion datasets. To address this gap, our study uses the GoEmotions dataset, which covers a diverse range of emotions, to evaluate sentiment analysis methods on a corpus of 58,000 comments. Unlike the prior study by the Google team, which analyzed only two models, our research evaluates a diverse array of models. We investigate the performance of traditional classifiers such as Naive Bayes and Support Vector Machines (SVM), as well as state-of-the-art transformer-based models including BERT, RoBERTa, and GPT. Furthermore, our…
Taxonomy
Topics: Sentiment Analysis and Opinion Mining
Methods: Attention Is All You Need · WordPiece · Linear Warmup With Linear Decay · Dropout · Dense Connections · Softmax · RoBERTa · Layer Normalization · BERT
