TL;DR
DomainDemo is a dataset linking domains shared on Twitter with the demographics of the users who shared them, enabling analysis of online information flows and political discourse across sociodemographic groups over more than a decade (2011 to 2022).
Contribution
It introduces a novel dataset connecting Twitter-shared domains with detailed demographic data, facilitating new insights into online information sharing and political trends.
Findings
The domain-level metrics derived from the dataset align with existing domain classifications, which supports the dataset's validity.
The dataset covers over 129,000 websites from 2011 to 2022.
The dataset provides insight into how online information sharing varies across demographic groups.
Abstract
Social media play a pivotal role in disseminating web content, particularly during elections, yet our understanding of the association between demographic factors and information sharing online remains limited. Here, we introduce a unique dataset, DomainDemo, linking domains shared on Twitter (X) with the demographic characteristics of associated users, including age, gender, race, political affiliation, and geolocation, from 2011 to 2022. This new resource was derived from a panel of over 1.5 million Twitter users matched against their U.S. voter registration records, facilitating a better understanding of a decade of information flows on one of the most prominent social media platforms and trends in political and public discourse among registered U.S. voters from different sociodemographic groups. By aggregating user demographic information onto the domains, we derive five metrics…
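The aggregation step mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's method: the five metrics are not listed in this excerpt, so the example computes only a simple per-domain audience share. The toy data, the `party` lookup, and the function name `audience_shares` are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: (user_id, domain) share events and a user -> party lookup.
shares = [
    ("u1", "example-news.com"),
    ("u2", "example-news.com"),
    ("u3", "example-news.com"),
    ("u1", "example-blog.org"),
    ("u1", "example-blog.org"),  # repeat share by the same user
]
party = {"u1": "Democrat", "u2": "Republican", "u3": "Democrat"}


def audience_shares(shares, attribute):
    """Return, per domain, the fraction of unique sharers in each group."""
    users_by_domain = defaultdict(set)
    for user, domain in shares:
        users_by_domain[domain].add(user)  # count each user once per domain

    result = {}
    for domain, users in users_by_domain.items():
        counts = Counter(attribute[u] for u in users)
        total = sum(counts.values())
        result[domain] = {group: n / total for group, n in counts.items()}
    return result


by_domain = audience_shares(shares, party)
```

Counting unique users rather than raw shares is one possible design choice: it keeps a single prolific account from dominating a domain's profile. Whether the dataset's own metrics do this is not stated here.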
