Social Media Text Processing and Semantic Analysis for Smart Cities
João Filipe Figueiredo Pereira

TL;DR
This paper presents a framework for collecting, processing, and analyzing geo-located tweets in Portuguese and English to extract insights relevant to smart cities and transportation systems, using topic modeling and transportation-specific text classification.
Contribution
It introduces a comprehensive framework for social media data collection, filtering, multilingual text processing, and classification tailored for smart city applications.
Findings
Shared topics across different cities despite demographic differences
Word embeddings improve classification robustness over traditional methods
Large-scale analysis of 43 million tweets over three months
Abstract
With the rise of social media, people obtain and share information almost instantly on a 24/7 basis. Many research areas have tried to gain valuable insights from these large volumes of freely available user-generated content. With the goal of extracting knowledge from social media streams that might be useful in the context of intelligent transportation systems and smart cities, we designed and developed a framework that provides functionalities for parallel collection of geo-located tweets from multiple pre-defined bounding boxes (cities or regions), including filtering of non-complying tweets, text pre-processing for Portuguese and English, topic modeling, and transportation-specific text classifiers, as well as aggregation and data visualization. We performed an exploratory data analysis of geo-located tweets in 5 different cities: Rio de Janeiro, São Paulo, New York…
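As a rough illustration of the bounding-box step mentioned in the abstract, the sketch below assigns a geo-located tweet to a city or discards it as non-complying. It is not the authors' implementation: the `BoundingBox` class, the `assign_city` helper, the tweet dictionary layout, and the box coordinates are all hypothetical and only approximate.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class BoundingBox:
    """Axis-aligned box in degrees (hypothetical helper, not from the paper)."""
    name: str
    west: float
    south: float
    east: float
    north: float

    def contains(self, lon: float, lat: float) -> bool:
        # Simple range check; assumes the box does not cross the antimeridian.
        return self.west <= lon <= self.east and self.south <= lat <= self.north


# Illustrative boxes with rough, assumed coordinates (not the paper's values).
BOXES = [
    BoundingBox("Rio de Janeiro", -43.8, -23.1, -43.1, -22.7),
    BoundingBox("New York", -74.3, 40.5, -73.7, 40.9),
]


def assign_city(tweet: dict, boxes: Sequence[BoundingBox] = BOXES) -> Optional[str]:
    """Return the name of the first box containing the tweet, or None.

    Assumes a tweet is a dict with an optional "coordinates" entry holding
    a (longitude, latitude) pair; tweets without it are treated as
    non-complying and filtered out by returning None.
    """
    coords = tweet.get("coordinates")
    if not coords:
        return None
    lon, lat = coords
    for box in boxes:
        if box.contains(lon, lat):
            return box.name
    return None


if __name__ == "__main__":
    sample = {"text": "traffic on the bridge", "coordinates": (-43.2, -22.9)}
    print(assign_city(sample))
```

A real collector would apply the same check to each incoming tweet before pre-processing, so that only tweets falling inside one of the configured regions reach the topic modeling and classification stages.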
