Yelp Dataset Challenge: Review Rating Prediction
Nabiha Asghar

TL;DR
This paper explores various machine learning models and feature extraction techniques to predict star ratings from Yelp reviews, aiming to identify the most effective approach for review rating prediction.
Contribution
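It systematically compares sixteen models for review rating prediction on Yelp data, pairing each of four feature extraction methods (unigrams, bigrams, trigrams and Latent Semantic Indexing) with each of four machine learning algorithms.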
Findings
Identified the best model among the tested combinations.
Demonstrated the effectiveness of certain feature-algorithm pairs.
Provided insights into feature importance for rating prediction.
Abstract
Review websites, such as TripAdvisor and Yelp, allow users to post online reviews for various businesses, products and services, and have been recently shown to have a significant influence on consumer shopping behaviour. An online review typically consists of free-form text and a star rating out of 5. The problem of predicting a user's star rating for a product, given the user's text review for that product, is called Review Rating Prediction and has lately become a popular, albeit hard, problem in machine learning. In this paper, we treat Review Rating Prediction as a multi-class classification problem, and build sixteen different prediction models by combining four feature extraction methods, (i) unigrams, (ii) bigrams, (iii) trigrams and (iv) Latent Semantic Indexing, with four machine learning algorithms, (i) logistic regression, (ii) Naive Bayes classification, (iii) perceptrons,…
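To make the multi-class framing concrete, here is a minimal sketch of one of the sixteen combinations named in the abstract: word n-gram features paired with Naive Bayes classification. It is an illustration under stated assumptions, not the paper's implementation. The whitespace tokenizer, the add-one smoothing, the `ngrams` helper, the `NaiveBayes` class and the toy reviews are all hypothetical choices made for this example.

```python
import math
from collections import Counter, defaultdict


def ngrams(text, n):
    """Return the word n-grams of a lowercased, whitespace-tokenized text."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


class NaiveBayes:
    """Multinomial Naive Bayes over n-gram counts (add-one smoothing is an assumption)."""

    def __init__(self, n=1):
        self.n = n                                  # 1 = unigrams, 2 = bigrams, 3 = trigrams
        self.class_counts = Counter()               # number of reviews per star rating
        self.feature_counts = defaultdict(Counter)  # n-gram counts per star rating
        self.vocab = set()

    def fit(self, texts, stars):
        for text, star in zip(texts, stars):
            self.class_counts[star] += 1
            for gram in ngrams(text, self.n):
                self.feature_counts[star][gram] += 1
                self.vocab.add(gram)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_star, best_score = None, -math.inf
        for star, count in self.class_counts.items():
            # Log prior plus smoothed log likelihood of each known n-gram.
            score = math.log(count / total_docs)
            denom = sum(self.feature_counts[star].values()) + len(self.vocab)
            for gram in ngrams(text, self.n):
                if gram in self.vocab:  # skip n-grams never seen in training
                    score += math.log((self.feature_counts[star][gram] + 1) / denom)
            if score > best_score:
                best_star, best_score = star, score
        return best_star


# Made-up toy reviews for illustration only (not taken from the Yelp dataset).
train_texts = [
    "terrible food and rude staff",
    "terrible service never again",
    "average food nothing special",
    "great food and friendly staff",
    "great service loved it",
]
train_stars = [1, 1, 3, 5, 5]

model = NaiveBayes(n=1).fit(train_texts, train_stars)
prediction = model.predict("great friendly staff")
```

Passing `n=2` or `n=3` switches the same classifier to bigram or trigram features, which mirrors how the abstract varies the feature extraction method independently of the learning algorithm.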
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Digital Marketing and Social Media · Recommender Systems and Techniques
