Predicting a Business Star in Yelp from Its Reviews Text Alone
Mingming Fan, Maryam Khademi

TL;DR
This paper develops a machine learning approach to predict Yelp business ratings solely from review texts, aiming to provide unbiased overviews and reduce subjectivity in ratings.
Contribution
It introduces a novel combination of feature extraction methods and machine learning models to accurately predict business ratings from review texts.
Findings
Achieved a root mean square error (RMSE) of 0.6 with Linear Regression, using either the most frequent words or the most frequent adjectives as features.
Demonstrated effectiveness of POS-based feature selection in rating prediction.
Provided a method to generate unbiased business ratings from textual reviews.
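The pipeline summarized in these findings can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the toy reviews and star values are made up, the helper names (`tokenize`, `top_frequent_words`, `bag_of_words`, `fit_linear_regression`) are hypothetical, and the regression is fitted with simple batch gradient descent rather than whatever solver the paper used. The POS-based variant would additionally require a part-of-speech tagger to keep only adjectives, which is omitted here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def top_frequent_words(reviews, k):
    """Return the k most frequent words across all reviews (the vocabulary)."""
    counts = Counter()
    for text in reviews:
        counts.update(tokenize(text))
    return [word for word, _ in counts.most_common(k)]


def bag_of_words(text, vocab):
    """Count how often each vocabulary word occurs in one review."""
    counts = Counter(tokenize(text))
    return [float(counts[word]) for word in vocab]


def rmse(predicted, actual):
    """Root mean square error between predicted and true ratings."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def predict(X, weights, bias):
    """Linear model: dot product of weights and features, plus bias."""
    return [sum(w * x for w, x in zip(weights, row)) + bias for row in X]


def fit_linear_regression(X, y, lr=0.05, epochs=3000):
    """Fit least-squares weights with batch gradient descent (an assumption;
    any linear regression solver could be substituted)."""
    n, d = len(X), len(X[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(epochs):
        # Errors are computed once per epoch, so all parameters update together.
        errors = [p - t for p, t in zip(predict(X, weights, bias), y)]
        for j in range(d):
            weights[j] -= lr * sum(e * row[j] for e, row in zip(errors, X)) / n
        bias -= lr * sum(errors) / n
    return weights, bias


# Made-up toy data, for illustration only (not from the Yelp dataset).
reviews = [
    "great food great service",
    "terrible food rude service",
    "good food",
    "bad bad service",
]
stars = [5.0, 1.0, 4.0, 1.0]

vocab = top_frequent_words(reviews, k=5)
X = [bag_of_words(text, vocab) for text in reviews]
weights, bias = fit_linear_regression(X, stars)

train_rmse = rmse(predict(X, weights, bias), stars)
mean_star = sum(stars) / len(stars)
baseline_rmse = rmse([mean_star] * len(stars), stars)
print("vocabulary:", vocab)
print("training RMSE:", round(train_rmse, 3), "baseline RMSE:", round(baseline_rmse, 3))
```

The baseline here always predicts the mean star value, which gives a reference point for judging an RMSE figure. On real data the error should be measured on held-out businesses rather than on the training set as in this toy run.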
Abstract
Yelp online reviews are an invaluable source of information for users choosing where to visit or what to eat among numerous available options. But due to the overwhelming number of reviews, it is almost impossible for users to go through all of them and find the information they are looking for. To provide a business overview, one solution is to give the business a rating of 1 to 5 stars. This rating can be subjective and biased toward users' personalities. In this paper, we predict a business rating based on user-generated review texts alone. This not only provides an overview of plentiful long review texts but also cancels out subjectivity. Selecting the restaurant category from the Yelp Dataset Challenge, we use a combination of three feature generation methods as well as four machine learning models to find the best prediction result. Our approach is to create a bag of words from the top frequent words in…
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Advanced Text Analysis Techniques · Spam and Phishing Detection
