Predicting health inspection results from online restaurant reviews

Samantha Wong; Hamidreza Chinaei; Frank Rudzicz

arXiv:1603.05673 · cs.CL · March 21, 2016 · 2 cites

Open Access

TL;DR

This paper shows that linguistic features extracted from Yelp restaurant reviews can predict official health inspection reports with over 90% accuracy using simple support vector machines.

Contribution

The paper applies linguistic analytics to online restaurant reviews to predict health inspection outcomes, using two types of feature sets: keyword detection and topic model features.

Findings

01

Achieved over 90% prediction accuracy.

02

Keyword and topic model features extracted from reviews predict health inspection reports.

03

Simple SVM classifiers are sufficient for high accuracy.

Abstract

Informatics around public health are increasingly shifting from the professional to the public spheres. In this work, we apply linguistic analytics to restaurant reviews, from Yelp, in order to automatically predict official health inspection reports. We consider two types of feature sets, i.e., keyword detection and topic model features, and use these in several classification methods. Our empirical analysis shows that these extracted features can predict public health inspection reports with over 90% accuracy using simple support vector machines.
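The keyword detection feature set mentioned in the abstract can be illustrated with a minimal Python sketch. The keyword list, the function name `keyword_features`, and the per-review normalization are all assumptions made for illustration; the abstract does not specify which keywords or weighting the authors used. The resulting vector is the kind of input a support vector machine could be trained on.

```python
import re
from collections import Counter

# Hypothetical keyword list for illustration only; not taken from the paper.
HYGIENE_KEYWORDS = ["dirty", "sick", "roach", "clean", "smell"]


def keyword_features(review_texts, keywords=HYGIENE_KEYWORDS):
    """Return one value per keyword: total occurrences across all of a
    restaurant's reviews, divided by the number of reviews."""
    counts = Counter()
    for text in review_texts:
        # Lowercase and split into simple word tokens.
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(t for t in tokens if t in keywords)
    n = max(len(review_texts), 1)  # avoid division by zero for no reviews
    return [counts[k] / n for k in keywords]


# Example: two made-up reviews for a single restaurant.
reviews = [
    "The floor was dirty and I saw a roach.",
    "Clean tables, but a strange smell near the kitchen.",
]
features = keyword_features(reviews)
print(features)
```

Each restaurant would yield one such vector, paired with its inspection outcome as the label. Topic model features, the second feature set the abstract names, would be appended as additional columns.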

Taxonomy

Topics: Data-Driven Disease Surveillance · Advanced Text Analysis Techniques · Sentiment Analysis and Opinion Mining