A Comparative Analysis of Content-based Geolocation in Blogs and Tweets
Konstantinos Pappas, Mahmoud Azab, Rada Mihalcea

TL;DR
This paper compares text-based geolocation methods on Blogger and Twitter, introducing new features that improve accuracy and analyzing factors affecting geolocability across different social media platforms and user demographics.
Contribution
It introduces novel location-specific features for geolocation, compares performance across Blogger and Twitter, and investigates cross-media and demographic effects on geolocability.
Findings
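The new location-specific features reduce the error rate by up to 12.5% relative to the best previously proposed geolocation features
Blogger users are significantly harder to geolocate than Twitter users, despite posting longer text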
Cross-media and demographic factors influence geolocability
Abstract
The geolocation of online information is an essential component in any geospatial application. While most of the previous work on geolocation has focused on Twitter, in this paper we quantify and compare the performance of text-based geolocation methods on social media data drawn from both Blogger and Twitter. We introduce a novel set of location-specific features that are both highly informative and easily interpretable, and show that we can achieve error rate reductions of up to 12.5% with respect to the best previously proposed geolocation features. We also show that despite posting longer text, Blogger users are significantly harder to geolocate than Twitter users. Additionally, we investigate the effect of training and testing on different media (cross-media predictions), or combining multiple social media sources (multi-media predictions). Finally, we explore the geolocability of…
Taxonomy
Topics: Complex Network Analysis Techniques · Expert finding and Q&A systems · Human Mobility and Location-Based Analysis
