Predicting health inspection results from online restaurant reviews
Samantha Wong, Hamidreza Chinaei, Frank Rudzicz

TL;DR
This paper shows that linguistic features extracted from Yelp restaurant reviews can predict official health inspection results with over 90% accuracy using simple support vector machines.
Contribution
It applies linguistic analytics to online restaurant reviews to predict health inspection outcomes, using two feature sets: keyword detection and topic model features.
Findings
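Prediction accuracy on official health inspection reports exceeded 90%.
Linguistic features extracted from reviews (keyword detection and topic model features) are effective predictors of inspection results.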
Simple SVM classifiers are sufficient for high accuracy.
Abstract
Informatics around public health are increasingly shifting from the professional to the public spheres. In this work, we apply linguistic analytics to restaurant reviews, from Yelp, in order to automatically predict official health inspection reports. We consider two types of feature sets, i.e., keyword detection and topic model features, and use these in several classification methods. Our empirical analysis shows that these extracted features can predict public health inspection reports with over 90% accuracy using simple support vector machines.
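As a rough illustration of the keyword-detection feature set mentioned in the abstract, the sketch below turns a review into a vector of keyword counts that a classifier such as a support vector machine could consume. The keyword list and the function name `keyword_features` are assumptions made for this example only; the paper's actual keywords, preprocessing, and classifier settings are not given here.

```python
import re
from collections import Counter

# Hypothetical keyword list, for illustration only.
# The keywords actually used in the paper are not stated in this summary.
HYGIENE_KEYWORDS = ("dirty", "sick", "roach", "filthy", "clean")


def keyword_features(review, keywords=HYGIENE_KEYWORDS):
    """Return one occurrence count per keyword, in keyword order."""
    # Lowercase the text and split it into simple word tokens.
    tokens = re.findall(r"[a-z']+", review.lower())
    counts = Counter(tokens)
    # A Counter returns 0 for keywords that never appear.
    return [counts[keyword] for keyword in keywords]


example = "The tables were dirty, the floor was dirty, and I got sick."
features = keyword_features(example)
print(features)  # [2, 1, 0, 0, 0]
```

Each restaurant's reviews would yield one such vector, and the vectors, paired with inspection outcomes as labels, would form the training set for the classifier. Exact-match counting is the simplest choice; stemming or phrase matching would be natural refinements.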
Taxonomy
Topics: Data-Driven Disease Surveillance · Advanced Text Analysis Techniques · Sentiment Analysis and Opinion Mining
