Clustering memes in social media streams
Mohsen JafariAsbagh, Emilio Ferrara, Onur Varol, Filippo Menczer, Alessandro Flammini

TL;DR
This paper presents a streaming framework for real-time detection and clustering of memes on Twitter, combining content, social network, and diffusion data to identify trending topics effectively.
Contribution
The authors introduce a novel online clustering method that integrates multiple data dimensions and employs a memory mechanism to adapt to evolving social media content.
Findings
The framework outperforms baseline algorithms that rely on content features only.
It outperforms a state-of-the-art event detection method.
It is effective at recovering trending hashtags over time.
Abstract
The problem of clustering content in social media has pervasive applications, including the identification of discussion topics, event detection, and content recommendation. Here we describe a streaming framework for online detection and clustering of memes in social media, specifically Twitter. A pre-clustering procedure, namely protomeme detection, first isolates atomic tokens of information carried by the tweets. Protomemes are thereafter aggregated, based on multiple similarity measures, to obtain memes as cohesive groups of tweets reflecting actual concepts or topics of discussion. The clustering algorithm takes into account various dimensions of the data and metadata, including natural language, the social network, and the patterns of information diffusion. As a result, our system can build clusters of semantically, structurally, and topically related tweets. The clustering…
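The two-stage idea in the abstract (protomeme detection, then aggregation by multiple similarity measures) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes that protomemes are keyed by hashtags, mentions, and URLs, that each similarity is a Jaccard overlap, that the measures are combined by taking their maximum, and that a single-pass threshold rule assigns protomemes to clusters. The names extract_protomemes, similarity, cluster_protomemes, and the threshold value are hypothetical.

```python
import re
from collections import defaultdict

# Assumed token pattern: hashtags, mentions, and URLs act as protomeme keys.
TOKEN_RE = re.compile(r"[#@]\w+|https?://\S+")
FEATURES = ("tweet_ids", "users", "words")


def extract_protomemes(tweets):
    """Group tweets that share the same atomic token into one protomeme."""
    protomemes = defaultdict(lambda: {f: set() for f in FEATURES})
    for tweet in tweets:
        tokens = TOKEN_RE.findall(tweet["text"])
        # Plain words are what remains once the tokens are stripped out.
        plain = TOKEN_RE.sub(" ", tweet["text"]).lower()
        words = set(re.findall(r"[a-z]+", plain))
        for tok in tokens:
            key = tok if tok.startswith("http") else tok.lower()
            pm = protomemes[key]
            pm["tweet_ids"].add(tweet["id"])
            pm["users"].add(tweet["user"])
            pm["words"] |= words
    return dict(protomemes)


def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity(p, q):
    """Combine per-feature similarities by taking the maximum (an assumption)."""
    return max(jaccard(p[f], q[f]) for f in FEATURES)


def cluster_protomemes(protomemes, threshold=0.5):
    """Single pass: join the most similar cluster, or open a new one."""
    clusters = []
    for key, pm in protomemes.items():
        best, best_sim = None, threshold
        for c in clusters:
            s = similarity(pm, c)
            if s >= best_sim:
                best, best_sim = c, s
        if best is None:
            best = {f: set() for f in FEATURES}
            best["protomemes"] = set()
            clusters.append(best)
        best["protomemes"].add(key)
        for f in FEATURES:
            best[f] |= pm[f]
    return clusters
```

A streaming variant would additionally drop protomemes and clusters that fall outside a recent time window; that step is omitted here because the summary above does not specify how it works.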
