Modular networks of word correlations on Twitter
Joachim Mathiesen, Pernille Yde, Mogens H. Jensen

TL;DR
This paper builds word correlation networks from Twitter data for three word categories (international brands, nouns, and major US cities) and finds that words representing similar entities cluster into modules, which account for the heavy-tailed distribution of pair correlations.
Contribution
It introduces a method for building word correlation networks from Twitter, with link strengths set by a similarity measure based on how often words co-appear, and demonstrates the modular organization of correlated words across different categories.
Findings
Heavy-tailed distribution of pair correlations
Modules of similar entities form distinct clusters
Pair correlations deviate from a null model in which words are uncorrelated
Abstract
Complex networks are important tools for analyzing the information flow in many aspects of nature and human society. Using data from the microblogging service Twitter, we study networks of correlations in the appearance of words from three different categories: international brands, nouns, and major US cities. We create networks where the strength of links is determined by a similarity measure based on the rate of co-appearance of words. In comparison with the null model, where words are assumed to be uncorrelated, the heavy-tailed distribution of pair correlations is shown to be a consequence of modules of words representing similar entities.
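The construction described in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact similarity measure, so the code below assumes a Jaccard-style co-appearance rate (tweets containing both words divided by tweets containing either), and it assumes a null model that reassigns each word to random tweets while keeping its frequency. The function names `cooccurrence_similarity` and `shuffled_null`, and the toy tweets, are hypothetical and not taken from the paper.

```python
import random
from itertools import combinations


def cooccurrence_similarity(tweets, words):
    """Return {(w1, w2): similarity} for every pair of tracked words.

    Assumption: similarity is a Jaccard-style co-appearance rate,
    not necessarily the measure used in the paper.
    Tracked words are expected to be lowercase.
    """
    # Indices of the tweets in which each tracked word appears.
    occurrences = {w: set() for w in words}
    for idx, text in enumerate(tweets):
        tokens = set(text.lower().split())
        for w in words:
            if w in tokens:
                occurrences[w].add(idx)

    sims = {}
    for a, b in combinations(sorted(words), 2):
        either = occurrences[a] | occurrences[b]
        both = occurrences[a] & occurrences[b]
        sims[(a, b)] = len(both) / len(either) if either else 0.0
    return sims


def shuffled_null(tweets, words, seed=0):
    """Assumed null model: each word keeps its frequency but is
    placed in randomly chosen tweets, which removes correlations."""
    rng = random.Random(seed)
    n = len(tweets)
    fake = [set() for _ in range(n)]
    for w in words:
        count = sum(w in set(t.lower().split()) for t in tweets)
        for idx in rng.sample(range(n), count):
            fake[idx].add(w)
    return cooccurrence_similarity([" ".join(s) for s in fake], words)


# Toy data, invented for illustration only.
tweets = [
    "nike and adidas shoes",
    "adidas nike sale",
    "boston weather",
    "boston chicago flight",
    "nike run",
]
words = ["nike", "adidas", "boston", "chicago"]

observed = cooccurrence_similarity(tweets, words)
null = shuffled_null(tweets, words)
```

In this toy example the brand pair and the city pair receive nonzero link weights while cross-category pairs such as `("adidas", "boston")` receive zero, which is the kind of modular pattern the abstract describes. Comparing the distribution of `observed` values against `null` values on real data would be the analogue of the paper's null-model comparison.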
