Where did you tweet from? Inferring the origin locations of tweets based on contextual information
Rabindra Lamsal, Aaron Harwood, Maria Rodriguez Read

TL;DR
This paper presents the True Origin Model, a machine learning framework that infers the likely origin locations of tweets using natural language understanding, achieving promising accuracy across multiple geographic levels despite limited geotagged data.
Contribution
The work introduces the True Origin Model and locBERT, novel tools for inferring tweet origins, addressing the Location A/B problem with improved accuracy and a new dataset for future research.
Findings
Achieves 80% accuracy at country level
Achieves 58% accuracy at city level
Highlights issues with current ground truth methodologies
Abstract
Public conversations on Twitter comprise many pertinent topics, including disasters, protests, politics, propaganda, sports, climate change, and epidemic/pandemic outbreaks, that can have both regional and global aspects. Spatial discourse analysis relies on geographical data. However, fewer than 1% of tweets today are geotagged, whether with a point location or with bounding place information. A major issue with tweets is that Twitter users can be at location A and exchange conversations specific to location B, which we call the Location A/B problem. The problem is considered solved if location entities can be classified as either origin locations (Location As) or non-origin locations (Location Bs). In this work, we propose a simple yet effective framework, the True Origin Model, to address the problem; it uses machine-level natural language understanding to identify tweets that conceivably…
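To make the Location A/B framing concrete, here is a minimal sketch of the classification target only. It is not the True Origin Model or locBERT: the `LocationMention` class, the `split_mentions` helper, and the example tweet are hypothetical, and the origin labels are assigned by hand, whereas the paper's framework would predict them.

```python
from dataclasses import dataclass


@dataclass
class LocationMention:
    text: str        # location entity as it appears in the tweet
    is_origin: bool  # True = Location A (origin), False = Location B (non-origin)


def split_mentions(mentions):
    """Partition labelled mentions into origin (A) and non-origin (B) lists."""
    origins = [m.text for m in mentions if m.is_origin]
    non_origins = [m.text for m in mentions if not m.is_origin]
    return origins, non_origins


# Hypothetical tweet: "Stuck at home in Melbourne, watching the floods in Jakarta."
# Labels below are hand-assigned for illustration, not model output.
mentions = [
    LocationMention("Melbourne", is_origin=True),
    LocationMention("Jakarta", is_origin=False),
]
origins, non_origins = split_mentions(mentions)
```

Under this framing, the problem counts as solved for a tweet when every location entity lands in the correct one of the two lists.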
Taxonomy
Topics: Geographic Information Systems Studies · Human Mobility and Location-Based Analysis · Data-Driven Disease Surveillance
Methods: Multi-Head Attention · Attention Is All You Need · Test · Linear Layer · Weight Decay · Dense Connections · WordPiece · Residual Connection · Adam
