A Cross-Platform Collection of Social Network Profiles
Maria Han Veiga, Carsten Eickhoff

TL;DR
This paper introduces a dataset of 850 users, each with linked profiles on Twitter, Instagram, and Foursquare, enabling research on privacy risks, user de-anonymization, and cross-platform social media analysis.
Contribution
It provides a structured, multi-platform social network dataset with detailed user footprints, facilitating studies on privacy hazards and user identification techniques.
Findings
The dataset includes over 2.5 million tweets, 340,000 check-ins, and 42,000 Instagram posts.
It supports research on privacy hazards and user de-anonymization across platforms.
The paper describes the data collection methodology and discusses potential use cases.
Abstract
The proliferation of Internet-enabled devices and services has led to a shifting balance between digital and analogue aspects of our everyday lives. In the face of this development there is a growing demand for the study of privacy hazards, the potential for unique user de-anonymization and information leakage between the various social media profiles many of us maintain. To enable the structured study of such adversarial effects, this paper presents a dedicated dataset of cross-platform social network personas (i.e., the same person has accounts on multiple platforms). The corpus comprises 850 users who generate predominantly English content. Each user object contains the online footprint of the same person in three distinct social networks: Twitter, Instagram and Foursquare. In total, it encompasses over 2.5M tweets, 340k check-ins and 42k Instagram posts. We describe the collection…
