Characterizing Geo-located Tweets in Brazilian Megacities
João Pereira, Arian Pasquali, Pedro Saleiro, Rosaldo Rossetti, Nélio Cacho

TL;DR
This study analyzes over 9 million geo-located tweets from Brazil's largest cities to identify common and city-specific topics, providing insights into urban social dynamics for smart city applications.
Contribution
It introduces a framework for collecting and analyzing geo-located tweets, applying topic modeling to reveal urban social interests and differences in Rio de Janeiro and São Paulo.
Findings
Identified 29 distinct topics across both cities.
Found that most topics are shared between the two cities, reflecting similar interests and concerns.
Detected city-specific predominant topics.
Abstract
This work presents a framework for collecting, processing and mining geo-located tweets in order to extract meaningful and actionable knowledge in the context of smart cities. We collected and characterized more than 9M tweets from the two biggest cities in Brazil, Rio de Janeiro and São Paulo. We performed topic modeling using the Latent Dirichlet Allocation model to produce an unsupervised distribution of semantic topics over the stream of geo-located tweets as well as a distribution of words over those topics. We manually labeled and aggregated similar topics obtaining a total of 29 different topics across both cities. Results showed similarities in the majority of topics for both cities, reflecting similar interests and concerns among the population of Rio de Janeiro and São Paulo. Nevertheless, some specific topics are more predominant in one of the cities.
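To make the topic-modeling step concrete, here is a minimal sketch of Latent Dirichlet Allocation fitted by collapsed Gibbs sampling, written in plain Python. It is an illustration only, not the authors' implementation: the abstract does not say which LDA software, preprocessing, inference method, or hyperparameters were used, so the function names (`lda_gibbs`, `top_words`), the values of `alpha` and `beta`, and the toy token lists are all assumptions. The sketch returns the two distributions the abstract mentions: topics per document (`theta`) and words per topic (`phi`).

```python
import random


def lda_gibbs(docs, num_topics, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Fit LDA on tokenized documents with collapsed Gibbs sampling.

    Hypothetical helper for illustration; alpha and beta are assumed values.
    Returns (vocab, theta, phi) where theta[d][k] is the share of topic k in
    document d and phi[k][v] is the probability of word vocab[v] in topic k.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]            # tokens of doc d in topic k
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # count of word v in topic k
    topic_total = [0] * num_topics                          # total tokens in topic k
    assignments = []                                        # topic of each token

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        doc_assignments = []
        for w in doc:
            k = rng.randrange(num_topics)
            doc_assignments.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(doc_assignments)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Unnormalized conditional probability of each topic.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # Smoothed point estimates of both distributions.
    theta = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in counts]
        for counts, doc in zip(doc_topic, docs)
    ]
    phi = [
        [(c + beta) / (topic_total[k] + vocab_size * beta) for c in topic_word[k]]
        for k in range(num_topics)
    ]
    return vocab, theta, phi


def top_words(vocab, phi, topic, n=5):
    """Return the n most probable words of one topic (hypothetical helper)."""
    ranked = sorted(range(len(vocab)), key=lambda v: phi[topic][v], reverse=True)
    return [vocab[v] for v in ranked[:n]]


# Made-up, already tokenized "tweets"; real input would be preprocessed text.
toy_docs = [
    ["traffic", "bus", "late", "traffic"],
    ["beach", "sun", "weekend", "beach"],
    ["bus", "traffic", "rain"],
    ["sun", "beach", "friends"],
]

if __name__ == "__main__":
    vocab, theta, phi = lda_gibbs(toy_docs, num_topics=2)
    for k in range(2):
        print(k, top_words(vocab, phi, k, n=3))
```

The top-word lists printed per topic are what a person would inspect when manually labeling and merging similar topics, as the abstract describes. Gibbs sampling is stochastic, so the topics found on such a tiny corpus vary with the seed.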
