Describing and Understanding Neighborhood Characteristics through Online Social Media

Mohamed Kafsi; Henriette Cramer; Bart Thomee; David A. Shamma

arXiv:1503.03524 · stat.ML · March 13, 2015

TL;DR

This paper introduces the geographical hierarchy model (GHM), a probabilistic approach that leverages geotagged social media data to identify and compare region-specific content, improving classification accuracy over traditional methods.

Contribution

The paper presents the GHM, a novel probabilistic model that distinguishes local from general content in geotagged data, enhancing regional characterization and comparison capabilities.

Findings

01

GHM improves classification accuracy by 47% over Naive Bayes.

02

GHM outperforms hierarchical TF-IDF by 27%.

03

The model identifies region-specific content and supports comparison between regions.

Abstract

Geotagged data can be used to describe regions in the world and discover local themes. However, not all data produced within a region is necessarily specifically descriptive of that area. To surface the content that is characteristic for a region, we present the geographical hierarchy model (GHM), a probabilistic model based on the assumption that data observed in a region is a random mixture of content that pertains to different levels of a hierarchy. We apply the GHM to a dataset of 8 million Flickr photos in order to discriminate between content (i.e., tags) that specifically characterizes a region (e.g., neighborhood) and content that characterizes surrounding areas or more general themes. Knowledge of the discriminative and non-discriminative terms used throughout the hierarchy enables us to quantify the uniqueness of a given region and to compare similar but distant regions. Our…
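The mixture assumption described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the three-level hierarchy, the tag distributions, the mixing weights, and the function names are hypothetical and are not taken from the paper, and the sketch does not reproduce the authors' inference procedure. It only shows how a tag observed in a region could be scored as a mixture over the levels above that region, and how one might ask which level most plausibly generated it.

```python
from typing import Dict, List

# Hypothetical toy hierarchy: world -> city -> neighborhood.
# Each node has its own tag distribution (values sum to 1 per node).
TAG_DISTS: Dict[str, Dict[str, float]] = {
    "world": {"sunset": 0.6, "food": 0.3, "mural": 0.1},
    "san_francisco": {"goldengate": 0.6, "fog": 0.4},
    "mission": {"mural": 0.8, "burrito": 0.2},
}

# Path from the root to the region of interest, with assumed mixing weights
# (one weight per level, summing to 1).
PATH: List[str] = ["world", "san_francisco", "mission"]
WEIGHTS: List[float] = [0.5, 0.3, 0.2]


def mixture_prob(tag: str, path: List[str], weights: List[float]) -> float:
    """Probability of a tag in a region as a mixture over hierarchy levels."""
    return sum(w * TAG_DISTS[node].get(tag, 0.0) for node, w in zip(path, weights))


def level_posterior(tag: str, path: List[str], weights: List[float]) -> Dict[str, float]:
    """Posterior probability that each level on the path generated the tag."""
    joint = {node: w * TAG_DISTS[node].get(tag, 0.0) for node, w in zip(path, weights)}
    total = sum(joint.values())
    if total == 0.0:
        # The tag is unknown to every level on this path.
        return {node: 0.0 for node in path}
    return {node: value / total for node, value in joint.items()}


if __name__ == "__main__":
    for tag in ("mural", "sunset", "fog"):
        post = level_posterior(tag, PATH, WEIGHTS)
        best = max(post, key=post.get)
        print(f"{tag}: p={mixture_prob(tag, PATH, WEIGHTS):.3f}, most likely level={best}")
```

In this toy setup, a tag such as "mural" is attributed mostly to the neighborhood level even though it also appears in the general distribution, which mirrors the distinction the abstract draws between content that characterizes a region and content that reflects more general themes.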
