Assessment of Effectiveness of Content Models for Approximating Twitter Social Connection Structures
Kuntal Dey, Sahil Agrawal, Rahul Malviya, Saroj Kaushik

TL;DR
This study evaluates how well three content models (unigram, bigram, and LDA) approximate Twitter social connection structures from user-generated content, and finds that the unigram model best preserves community properties.
Contribution
The paper provides an empirical comparison of content models for social link prediction on Twitter, highlighting the superior performance of unigram models in maintaining community structures.
Findings
Unigram models outperform bigram and LDA in preserving social communities.
Finer-grained semantic content improves link prediction accuracy.
Word usage within communities converges, supporting community-specific language evolution.
Abstract
This paper explores the social quality (goodness) of community structures formed across Twitter users, where social links within the structures are estimated based upon semantic properties of user-generated content (corpus). We examined the overlap of the community structures of the constructed graphs, and followership-based social communities, to find the social goodness of the links constructed. Unigram, bigram and LDA content models were empirically investigated for evaluation of effectiveness, as approximators of underlying social graphs, such that they maintain the community social property. Impact of content at varying granularities, for the purpose of predicting links while retaining the social community structures, was investigated. 100 discussion topics, spanning over 10 Twitter events, were used for experiments. The unigram language model performed the best, indicating…
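To make the idea of estimating links from content concrete, here is a minimal sketch and not the authors' implementation. It assumes each user's tweets are concatenated into one string, builds unigram or bigram count profiles, and links two users when the cosine similarity of their profiles reaches a threshold. The function names, the cosine measure, and the threshold value of 0.3 are all illustrative assumptions; the abstract does not state which similarity measure or cutoff the paper uses.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def ngram_counts(text, n=1):
    """Count word n-grams (n=1 for unigrams, n=2 for bigrams) in a text."""
    tokens = text.lower().split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count profiles."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def estimate_links(user_texts, n=1, threshold=0.3):
    """Return the set of user pairs whose content similarity meets the threshold.

    The threshold is a hypothetical parameter chosen for illustration only.
    """
    profiles = {user: ngram_counts(text, n) for user, text in user_texts.items()}
    return {
        frozenset((u, v))
        for u, v in combinations(sorted(profiles), 2)
        if cosine(profiles[u], profiles[v]) >= threshold
    }


# Toy data, invented for this example.
user_texts = {
    "alice": "match goal score goal",
    "bob": "goal score match",
    "carol": "election vote poll",
}
unigram_links = estimate_links(user_texts, n=1)
bigram_links = estimate_links(user_texts, n=2)
```

In a full pipeline, the estimated link set would be turned into a graph, communities would be detected on it, and those communities would be compared against followership-based communities; that step is omitted here because the abstract does not specify the detection algorithm or overlap measure.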
Taxonomy
Topics: Complex Network Analysis Techniques · Misinformation and Its Impacts · Social Media and Politics
