Towards Real-Time, Country-Level Location Classification of Worldwide Tweets

Arkaitz Zubiaga; Alex Voss; Rob Procter; Maria Liakata; Bo Wang; Adam Tsakalidis

arXiv:1604.07236·cs.IR·April 26, 2017

TL;DR

This paper explores real-time classification of worldwide tweets at the country level using tweet-inherent features, demonstrating that combining content and metadata improves accuracy and that models trained on historical data can be effective over time.

Contribution

The paper introduces an approach to country-level classification of tweets from anywhere in the world using only tweet-inherent features, and evaluates how well trained models hold up over time.

Findings

01. Combining tweet content and metadata improves classification accuracy by 20-50% compared with relying on a single feature.

02. Content, self-reported location, and real name are highly useful features.

03. Models trained on historical data can classify new tweets effectively without retraining.
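To make the idea of combining content and metadata concrete, here is a minimal sketch in Python. It is not the authors' implementation: the classifier (a small multinomial Naive Bayes with add-one smoothing), the flat field names `content`, `user_location` and `user_name`, and the toy tweets are all assumptions chosen for illustration. Each field's words are prefixed with the field name so that the same word in different fields counts as a different feature.

```python
import math
from collections import Counter, defaultdict


def extract_features(tweet, fields):
    """Turn the selected tweet fields into namespaced tokens, e.g. 'content=hola'."""
    tokens = []
    for field in fields:
        for word in str(tweet.get(field, "")).lower().split():
            tokens.append(f"{field}={word}")
    return tokens


class NaiveBayesCountryClassifier:
    """Multinomial Naive Bayes with add-one smoothing (an illustrative choice,
    not the classifier reported in the paper)."""

    def fit(self, tweets, countries, fields):
        self.fields = fields
        self.class_counts = Counter(countries)
        self.token_counts = defaultdict(Counter)
        self.vocab = set()
        for tweet, country in zip(tweets, countries):
            tokens = extract_features(tweet, fields)
            self.token_counts[country].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, tweet):
        tokens = extract_features(tweet, self.fields)
        total = sum(self.class_counts.values())
        best_country, best_score = None, -math.inf
        for country, count in self.class_counts.items():
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(count / total)
            denom = sum(self.token_counts[country].values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((self.token_counts[country][tok] + 1) / denom)
            if score > best_score:
                best_country, best_score = country, score
        return best_country


# Hypothetical toy data; field names are assumptions, not the Twitter API schema.
train = [
    {"content": "good morning london", "user_location": "london uk", "user_name": "james smith"},
    {"content": "buenos dias madrid", "user_location": "madrid españa", "user_name": "maria garcia"},
    {"content": "rainy day again", "user_location": "manchester uk", "user_name": "oliver jones"},
    {"content": "vamos al partido", "user_location": "sevilla españa", "user_name": "jose lopez"},
]
labels = ["GB", "ES", "GB", "ES"]

# Combine content with two metadata fields into one feature space.
model = NaiveBayesCountryClassifier().fit(
    train, labels, ["content", "user_location", "user_name"]
)
new_tweet = {"content": "lovely day", "user_location": "leeds uk", "user_name": "emma"}
prediction = model.predict(new_tweet)
print(prediction)
```

Passing a shorter `fields` list (for example only `["content"]`) trains a single-feature model on the same data, which is one way to compare feature combinations. Training on an older collection and calling `predict` on newer tweets mirrors, in miniature, the historical-data setting described in the third finding.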

Abstract

In contrast to much previous work that has focused on location classification of tweets restricted to a specific country, here we undertake the task in a broader context by classifying global tweets at the country level, which is so far unexplored in a real-time scenario. We analyse the extent to which a tweet's country of origin can be determined by making use of eight tweet-inherent features for classification. Furthermore, we use two datasets, collected a year apart from each other, to analyse the extent to which a model trained from historical tweets can still be leveraged for classification of new tweets. With classification experiments on all 217 countries in our datasets, as well as on the top 25 countries, we offer some insights into the best use of tweet-inherent features for an accurate country-level classification of tweets. We find that the use of a single feature, such as…

Code & Models

Repositories

MALHARULHAS/A-Country_level-location-classification-system-for-twitter-tweets-from-the-whole-world