Autodetection and Classification of Hidden Cultural City Districts from Yelp Reviews

Harini Suresh; Nicholas Locascio

arXiv:1501.02527 · cs.CL · January 13, 2015 · 2 cites

TL;DR

This paper employs topic modeling and clustering techniques on Yelp reviews to identify and classify both known and hidden cultural districts within cities, enhancing understanding of urban cultural landscapes.

Contribution

It introduces a combined approach using LDA and clustering methods to detect and visualize hidden cultural districts from review data.
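
To make the clustering half of this approach concrete, here is a minimal sketch, assuming each restaurant has already been reduced to a latitude, a longitude, and an LDA topic mixture. The page does not say how the authors combine location with topic similarity, so the feature concatenation, the `topic_weight` parameter, the farthest-point seeding, and the toy data below are all illustrative assumptions rather than the paper's method.

```python
# Hypothetical sketch: k-means over [lat, lon] + weighted topic mixture.
# The feature layout and topic_weight are assumptions, not the paper's setup.

def combine(lat, lon, topics, topic_weight=1.0):
    """Concatenate location with a weighted topic mixture."""
    return [lat, lon] + [topic_weight * t for t in topics]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def seed_centroids(points, k):
    """Deterministic farthest-point seeding (an assumption; random init is also common)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(farthest))
    return centroids

def kmeans(points, k, iters=50):
    """Lloyd's algorithm; returns (centroids, labels)."""
    centroids = seed_centroids(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                      for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break
        labels = new_labels
    return centroids, labels

# Toy data (made up): (lat, lon, topic mixture over 2 topics).
restaurants = [
    (0.00, 0.00, [0.90, 0.10]),
    (0.01, 0.02, [0.80, 0.20]),
    (0.02, 0.01, [0.85, 0.15]),
    (1.00, 1.00, [0.10, 0.90]),
    (1.01, 0.98, [0.20, 0.80]),
    (0.99, 1.02, [0.15, 0.85]),
]
points = [combine(lat, lon, topics) for lat, lon, topics in restaurants]
centroids, labels = kmeans(points, k=2)
print(labels)
```

With real coordinates, latitude/longitude and topic proportions sit on different scales, so some rescaling (here stood in for by `topic_weight`) would likely be needed before distances are meaningful.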

Findings

01. Successfully identified known cultural districts such as Chinatown.

02. Discovered hidden or less obvious districts based on review patterns.

03. Provided a visual map-based representation of districts and their similarities.

Abstract

Topic models are a way to discover underlying themes in an otherwise unstructured collection of documents. In this study, we use the Latent Dirichlet Allocation (LDA) topic model on a dataset of Yelp reviews to classify restaurants based on their reviews. Furthermore, we hypothesize that within a city, restaurants can be grouped into "clusters" based on both location and similarity. We use several clustering methods, including K-means Clustering and a Probabilistic Mixture Model, to uncover and classify districts within a city, both well-known and hidden (i.e., cultural areas like Chinatown or hearsay like "the best street for Italian restaurants"). We use these models to display and label different clusters on a map. We also introduce a topic similarity heatmap that displays the distribution of similarity to a new restaurant across a city.
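
One possible reading of the topic similarity heatmap is sketched below: compare a new restaurant's topic mixture with every existing restaurant's mixture, then average the scores per map cell. The abstract does not specify the similarity measure or the spatial binning, so cosine similarity, the rectangular grid, the `similarity_grid` helper, and the toy data are assumptions made for illustration only.

```python
import math

def cosine(a, b):
    """Cosine similarity between two topic vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def similarity_grid(restaurants, new_topics, bounds, rows, cols):
    """Average similarity to new_topics in each grid cell.

    restaurants: list of (lat, lon, topics), assumed to lie inside bounds.
    bounds: (lat_min, lat_max, lon_min, lon_max).
    Cells containing no restaurants are reported as None.
    """
    lat_min, lat_max, lon_min, lon_max = bounds
    sums = [[0.0] * cols for _ in range(rows)]
    counts = [[0] * cols for _ in range(rows)]
    for lat, lon, topics in restaurants:
        # Map the coordinate to a cell index, clamping the upper edge.
        r = min(int((lat - lat_min) / (lat_max - lat_min) * rows), rows - 1)
        c = min(int((lon - lon_min) / (lon_max - lon_min) * cols), cols - 1)
        sums[r][c] += cosine(topics, new_topics)
        counts[r][c] += 1
    return [[sums[r][c] / counts[r][c] if counts[r][c] else None
             for c in range(cols)]
            for r in range(rows)]

# Toy data (made up): two restaurants dominated by topic 0, one by topic 1.
existing = [
    (0.1, 0.1, [1.0, 0.0]),
    (0.2, 0.3, [1.0, 0.0]),
    (0.9, 0.8, [0.0, 1.0]),
]
grid = similarity_grid(existing, new_topics=[1.0, 0.0],
                       bounds=(0.0, 1.0, 0.0, 1.0), rows=2, cols=2)
print(grid)
```

The resulting grid could be handed to any plotting library as a heatmap layer; cells left as `None` mark areas with no reviewed restaurants rather than low similarity.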

Taxonomy

Topics: Music and Audio Processing · Text and Document Classification Technologies · Video Analysis and Summarization

Methods: Heatmap · k-Means Clustering