Improved Topic modeling in Twitter through Community Pooling
Federico Albanese, Esteban Feuerstein

TL;DR
This paper introduces a community-based tweet pooling method that enhances topic modeling in Twitter data by grouping tweets from user communities, improving accuracy and efficiency over existing methods.
Contribution
The paper proposes a novel community pooling scheme for Twitter topic modeling, outperforming previous pooling methods in quality and speed.
Findings
Community pooling improves topic cluster quality.
Reduces computation time relative to previous pooling methods.
Outperforms state-of-the-art schemes on multiple datasets.
Abstract
Social networks play a fundamental role in the propagation of information and news. Characterizing the content of the messages becomes vital for different tasks, such as breaking news detection, personalized message recommendation, fake user detection, information flow characterization and others. However, Twitter posts are short and often less coherent than other text documents, which makes it challenging to apply text mining algorithms to these datasets efficiently. Tweet-pooling (aggregating tweets into longer documents) has been shown to improve automatic topic decomposition, but the performance achieved in this task varies depending on the pooling method. In this paper, we propose a new pooling scheme for topic modeling in Twitter, which groups tweets whose authors belong to the same community (a group of users who mainly interact with each other but not with other groups) on a user…
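The pooling idea in the abstract (aggregating the tweets of authors in the same community into one longer document) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `community_pool`, the `(author, text)` input format, and the precomputed `user_community` mapping are all hypothetical, and the community detection step itself is assumed to have happened elsewhere. Keeping tweets from unassigned authors as single-tweet documents is likewise an assumption made for illustration.

```python
from collections import defaultdict


def community_pool(tweets, user_community):
    """Group tweet texts into one pooled document per author community.

    tweets: iterable of (author, text) pairs (hypothetical input format).
    user_community: dict mapping author -> community id, assumed to come
        from a community detection step that is not shown here.
    Tweets whose author has no community are kept as single-tweet
    documents (an assumption of this sketch, not taken from the paper).
    """
    pools = defaultdict(list)
    for idx, (author, text) in enumerate(tweets):
        # Fall back to a unique key so unassigned tweets are not merged.
        key = user_community.get(author, ("unpooled", idx))
        pools[key].append(text)
    # Concatenate each community's tweets into one longer document.
    return {key: " ".join(texts) for key, texts in pools.items()}


# Toy data, invented purely for illustration.
tweets = [
    ("alice", "new transit plan announced"),
    ("bob", "transit fares going up"),
    ("carol", "great match last night"),
    ("dave", "unrelated thought"),
]
user_community = {"alice": 0, "bob": 0, "carol": 1}
docs = community_pool(tweets, user_community)
```

The pooled documents in `docs` could then be passed to any topic model in place of the individual tweets.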
Taxonomy
Topics: Complex Network Analysis Techniques · Expert finding and Q&A systems · Web Data Mining and Analysis
