From TF-IDF to Transformers: A Comparative and Ensemble Approach to Sentiment Classification
Dip Biswas Shanto, Mitali Yadav, Prajwal Panth, Suresh Chandra Satapathy

TL;DR
This paper compares traditional ML models with transformer-based models for sentiment analysis of movie reviews. It finds that RoBERTa outperforms the other models evaluated and that ensemble methods further improve accuracy.
Contribution
It provides a comprehensive comparison of ML and transformer models for sentiment classification and shows the effectiveness of ensemble approaches.
Findings
RoBERTa achieved 93.02% accuracy, outperforming other models.
Ensemble methods improved overall classification performance.
Transformer models are more effective than traditional ML models for sentiment analysis.
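The summary does not say which ensembling scheme the authors used, so the following is only a minimal sketch of one common option, soft voting, in which each model's predicted probability of the positive class is averaged per review. The function name `soft_vote`, the 0.5 threshold, and all probability values are illustrative assumptions, not figures from the paper.

```python
from typing import Dict, List


def soft_vote(prob_positive: Dict[str, List[float]], threshold: float = 0.5) -> List[int]:
    """Average each model's P(positive) per review and threshold the mean.

    prob_positive maps a (hypothetical) model name to its per-review
    probabilities of the positive class. Returns 1 for positive, 0 for negative.
    """
    model_names = list(prob_positive)
    n_reviews = len(prob_positive[model_names[0]])
    labels = []
    for i in range(n_reviews):
        mean_p = sum(prob_positive[m][i] for m in model_names) / len(model_names)
        labels.append(1 if mean_p >= threshold else 0)
    return labels


# Made-up probabilities for three reviews; these are NOT results from the paper.
example_probs = {
    "logistic_regression": [0.80, 0.40, 0.55],
    "naive_bayes": [0.70, 0.30, 0.35],
    "roberta": [0.95, 0.60, 0.20],
}
predictions = soft_vote(example_probs)
print(predictions)
```

In the third review, one model leans positive on its own, but the averaged probability falls below the threshold, so the ensemble labels it negative. Averaging probabilities rather than counting votes lets a confident model outweigh two uncertain ones.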
Abstract
Sentiment analysis, also referred to as opinion mining, aims to extract opinions from text. For movie reviews and criticism, sentiment analysis can help predict whether a review is generally positive or negative. Traditional ML models can struggle to capture context or metaphorical sentiment accurately, because they rely largely on statistical word representations. The objective of this paper is to classify movie reviews as positive or negative. Several machine learning models are evaluated, and Natural Language Processing (NLP) methods are used for data preprocessing and model assessment. The IMDb dataset is used. Specifically, Naive Bayes, Logistic Regression, Support Vector Machines (SVM), LightGBM, LSTM, and transformer-based models such as RoBERTa and DistilBERT…
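To illustrate what a "statistical word representation" looks like, here is a minimal sketch of TF-IDF weighting, the representation named in the paper's title. The excerpt does not specify the authors' exact weighting variant or tokenizer, so the plain `tf * log(N / df)` formula, whitespace tokenization, the function name `tfidf`, and the two sample reviews are all assumptions made for illustration.

```python
import math
from collections import Counter
from typing import Dict, List


def tfidf(docs: List[str]) -> List[Dict[str, float]]:
    """Return one {term: weight} dict per document.

    weight = (term count / document length) * log(N / document frequency).
    Tokenization is a naive lowercase whitespace split (an assumption).
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents containing each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty documents
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Two made-up reviews, not taken from the IMDb dataset.
reviews = ["a great great movie", "a dull movie"]
weights = tfidf(reviews)
print(weights[0])
```

Terms that appear in every review ("a", "movie") receive zero weight, while distinctive terms ("great", "dull") are weighted up. Word order is discarded entirely, which is one reason such representations can miss the contextual cues the abstract mentions.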
