Automated Sentiment Classification and Topic Discovery in Large-Scale Social Media Streams
Yiwen Lu, Siheng Xiong, Zhaowei Li

TL;DR
This paper introduces a scalable framework for analyzing large-scale Twitter data by combining automated sentiment labeling, topic modeling, and interactive visualization to understand discourse in geopolitical contexts.
Contribution
It presents a novel pipeline integrating multiple pre-trained models, LDA-based topic discovery, and visualization tools for comprehensive social media analysis.
Findings
Effective sentiment classification across large datasets
Identification of latent themes related to geopolitical issues
Visualization of sentiment and topic trends over time and regions
Abstract
We present a framework for large-scale sentiment and topic analysis of Twitter discourse. Our pipeline begins with targeted data collection using conflict-specific keywords, followed by automated sentiment labeling via multiple pre-trained models to improve annotation robustness. We examine the relationship between sentiment and contextual features such as timestamp, geolocation, and lexical content. To identify latent themes, we apply Latent Dirichlet Allocation (LDA) on partitioned subsets grouped by sentiment and metadata attributes. Finally, we develop an interactive visualization interface to support exploration of sentiment trends and topic distributions across time and regions. This work contributes a scalable methodology for social media analysis in dynamic geopolitical contexts.
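The abstract does not say how the labels from the multiple pre-trained models are combined, or how the subsets for LDA are formed. The sketch below illustrates one plausible reading and is not the authors' implementation: it assumes a simple majority vote with a neutral fallback on ties, and partitions tweets by (sentiment, region) before topic modeling. The function names `majority_label` and `partition_by`, the field names, and the sample tweets are all hypothetical.

```python
from collections import Counter, defaultdict


def majority_label(labels, fallback="neutral"):
    """Return the most common label; use the fallback on ties or empty input.

    Majority voting is an assumption here, not the paper's stated method.
    """
    counts = Counter(labels).most_common()
    if not counts:
        return fallback
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return fallback  # top two labels tie, so no clear majority
    return counts[0][0]


def partition_by(tweets, keys):
    """Group tweet texts by the tuple of values found under the given keys."""
    groups = defaultdict(list)
    for tweet in tweets:
        groups[tuple(tweet[k] for k in keys)].append(tweet["text"])
    return dict(groups)


# Hypothetical tweets; "model_labels" stands in for per-model predictions.
tweets = [
    {"text": "ceasefire talks give some hope", "region": "EU",
     "model_labels": ["positive", "positive", "neutral"]},
    {"text": "shelling continued overnight", "region": "EU",
     "model_labels": ["negative", "negative", "negative"]},
    {"text": "aid convoy delayed again", "region": "US",
     "model_labels": ["negative", "neutral", "positive"]},
]

for tweet in tweets:
    tweet["sentiment"] = majority_label(tweet["model_labels"])

# Each subset would then be passed to an LDA implementation separately.
subsets = partition_by(tweets, keys=("sentiment", "region"))
```

Under these assumptions, each value in `subsets` is a list of tweet texts sharing one sentiment and one region, which is the kind of partition a per-group topic model could be fitted on.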
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Advanced Text Analysis Techniques · Complex Network Analysis Techniques
