Digital trace data collection through data donation
Laura Boeschoten, Jef Ausloos, Judith Moeller, Theo Araujo, and Daniel L. Oberski

TL;DR
This paper explores the use of GDPR-mandated data download packages (DDPs) for social science research, providing a blueprint and error framework to ensure data quality and representativeness in this new data collection method.
Contribution
It introduces a comprehensive blueprint and a total error framework for collecting and analyzing digital trace data via DDPs, addressing methodological challenges and quality control.
Findings
Proposes a systematic error framework for DDP-based data collection.
Provides a practical quality control checklist for researchers.
Highlights the potential of DDPs for large-scale social science research.
Abstract
A potentially powerful method of social-scientific data collection and investigation has been created by an unexpected institution: the law. Article 15 of the EU's 2018 General Data Protection Regulation (GDPR) mandates that individuals have electronic access to a copy of their personal data, and all major digital platforms now comply with this law by providing users with "data download packages" (DDPs). Through voluntary donation of DDPs, all data collected by public and private entities during the course of citizens' digital life can be obtained and analyzed to answer social-scientific questions - with consent. Thus, consented DDPs open the way for vast new research opportunities. However, while this entirely new method of data collection will undoubtedly gain popularity in the coming years, it also comes with its own questions of representativeness and measurement quality, which are…
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Advanced Data Storage Technologies · Data Quality and Management
