Yelp Review Rating Prediction: Machine Learning and Deep Learning Models
Zefang Liu

TL;DR
This paper compares traditional machine learning models with transformer-based models for predicting Yelp restaurant ratings from review text, and finds that XLNet performs best, reaching 70% accuracy on 5-star classification.
Contribution
It evaluates four traditional machine learning models and four transformer-based models on review rating prediction, and shows that XLNet outperforms the traditional methods.
Findings
XLNet achieves 70% accuracy in 5-star classification.
Transformer models outperform traditional machine learning models.
Balanced dataset and feature engineering improve model performance.
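The summary does not say how the balanced training dataset was built. One common approach is to undersample every star class down to the size of the rarest one. The sketch below assumes that approach, and the `(text, stars)` pair layout, the function name, and the toy reviews are all hypothetical rather than taken from the paper.

```python
import random
from collections import defaultdict


def balance_by_undersampling(reviews, seed=0):
    """Return a shuffled subset with equally many reviews per star rating.

    `reviews` is a list of (text, stars) pairs. This layout is an assumption
    made for the sketch, not the format of the Yelp Open Dataset.
    """
    by_stars = defaultdict(list)
    for text, stars in reviews:
        by_stars[stars].append((text, stars))

    # Every class is cut down to the size of the rarest class.
    n_per_class = min(len(group) for group in by_stars.values())

    rng = random.Random(seed)
    balanced = []
    for stars in sorted(by_stars):
        balanced.extend(rng.sample(by_stars[stars], n_per_class))
    rng.shuffle(balanced)
    return balanced


# Toy data, invented for illustration only.
toy_reviews = (
    [("great food", 5)] * 6
    + [("pretty good", 4)] * 4
    + [("it was fine", 3)] * 2
    + [("not great", 2)] * 3
    + [("terrible", 1)] * 5
)
balanced_reviews = balance_by_undersampling(toy_reviews)
```

Undersampling discards data from the frequent classes. Oversampling or class weighting are alternatives, and the summary does not indicate which the author chose.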
Abstract
We predict restaurant ratings from Yelp reviews using the Yelp Open Dataset. The data distribution is presented, and a balanced training dataset is built. Two vectorizers are tested for feature engineering. Four machine learning models are implemented: Naive Bayes, Logistic Regression, Random Forest, and Linear Support Vector Machine. Four transformer-based models are also applied: BERT, DistilBERT, RoBERTa, and XLNet. Accuracy, weighted F1 score, and the confusion matrix are used for model evaluation. XLNet achieves 70% accuracy on 5-star classification, compared with 64% for Logistic Regression.
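The three evaluation measures named in the abstract can be sketched in plain Python. This is a minimal illustration of the standard definitions, with weighted F1 taken as the per-class F1 averaged by class support. The function names and the toy labels are hypothetical, and this is not the author's evaluation code.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def weighted_f1(y_true, y_pred, labels):
    """Per-class F1 averaged with weights equal to each class's support."""
    matrix = confusion_matrix(y_true, y_pred, labels)
    total = len(y_true)
    score = 0.0
    for i in range(len(labels)):
        tp = matrix[i][i]
        support = sum(matrix[i])                   # true members of class i
        predicted = sum(row[i] for row in matrix)  # predictions of class i
        precision = tp / predicted if predicted else 0.0
        recall = tp / support if support else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        score += f1 * support / total
    return score


# Toy star ratings, invented for illustration only.
labels = [1, 2, 3, 4, 5]
y_true = [1, 2, 3, 4, 5, 5, 4, 1]
y_pred = [1, 3, 3, 4, 5, 4, 4, 1]

acc = accuracy(y_true, y_pred)
wf1 = weighted_f1(y_true, y_pred, labels)
cm = confusion_matrix(y_true, y_pred, labels)
```

On star ratings, the confusion matrix is useful beyond the two scalar scores because it shows whether errors fall on adjacent ratings (for example, 4 predicted for 5) or on distant ones.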
Taxonomy
Topics: Topic Modeling · Sentiment Analysis and Opinion Mining · Advanced Text Analysis Techniques
Methods: Linear Layer · Linear Warmup With Linear Decay · Attention Is All You Need · Byte Pair Encoding · Layer Normalization · SentencePiece · Dropout · Weight Decay · Dense Connections · Logistic Regression
