Assessing State-of-the-Art Sentiment Models on State-of-the-Art Sentiment Datasets
Jeremy Barnes, Roman Klinger, Sabine Schulte im Walde

TL;DR
This study systematically compares several sentiment analysis models across six benchmark datasets that differ in domain and label granularity, showing how well each model generalizes and reporting new state-of-the-art results on some of the datasets.
Contribution
The paper evaluates multiple models on the same set of diverse sentiment datasets, highlighting the effectiveness of Bi-LSTMs and the impact of sentiment-aware embeddings.
Findings
Bi-LSTMs perform well across datasets
LSTMs excel at fine-grained sentiment tasks
Sentiment-aware embeddings improve results on similar data
Abstract
There has been a good amount of progress in sentiment analysis over the past 10 years, including the proposal of new methods and the creation of benchmark datasets. In some papers, however, there is a tendency to compare models only on one or two datasets, either because of time constraints or because the model is tailored to a specific task. Accordingly, it is hard to understand how well a certain model generalizes across different tasks and datasets. In this paper, we address this situation by comparing several models on six different benchmarks, which belong to different domains and additionally have different levels of granularity (binary, 3-class, 4-class and 5-class). We show that Bi-LSTMs perform well across datasets and that both LSTMs and Bi-LSTMs are particularly good at fine-grained sentiment tasks (i.e., with more than two classes). Incorporating sentiment information…
