Evolution of repositories and privacy laws: commit activities in the GDPR and CCPA era
Georgia M. Kapitsaki, Maria Papoutsoglou

TL;DR
This study analyzes how open source repositories on GitHub have evolved in response to GDPR, CCPA, and related privacy laws, highlighting commit patterns and the need for better privacy compliance tools.
Contribution
It provides the first large-scale analysis of commit activities related to privacy laws in open source repositories, revealing trends and gaps in privacy compliance efforts.
Findings
Most commits occurred in the year laws took effect
Privacy terms appear in commit messages
References to specific user rights are scarce
Abstract
Free and open source software has gained a lot of momentum in the industry and the research community. The latest advances in privacy legislation, including the EU General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA), have forced the community to pay special attention to users' data privacy. The main aim of this work is to examine software repositories that are acting on privacy laws. We collected commit data from GitHub repositories to understand how they have referenced the main data privacy laws (GDPR, CCPA, CPRA, UK DPA) in recent years. Via an automated process, we analyzed 37,213 commits from 12,391 repositories since 2016, while 594 commits from the 70 most popular repositories of the dataset were analyzed manually. We observe that most commits were performed in the year the law came into effect and privacy-relevant terms appear in the…
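To make the automated step described in the abstract more concrete, here is a minimal sketch of how commit messages might be matched against privacy-law terms and tallied per year. It is not the authors' pipeline: the keyword patterns in `LAW_PATTERNS`, the helper names `laws_mentioned` and `commits_per_year`, and the sample commits are all assumptions made for illustration, and the study's actual search terms and tooling are not given in this summary.

```python
import re
from collections import Counter

# Hypothetical keyword patterns (an assumption, not the study's search terms).
LAW_PATTERNS = {
    "GDPR": re.compile(r"\bGDPR\b|General Data Protection Regulation", re.IGNORECASE),
    "CCPA": re.compile(r"\bCCPA\b|California Consumer Privacy Act", re.IGNORECASE),
    "CPRA": re.compile(r"\bCPRA\b|California Privacy Rights Act", re.IGNORECASE),
    "UK DPA": re.compile(r"\bUK DPA\b|Data Protection Act", re.IGNORECASE),
}


def laws_mentioned(message):
    """Return the laws whose pattern matches the commit message."""
    return [law for law, pattern in LAW_PATTERNS.items() if pattern.search(message)]


def commits_per_year(commits):
    """Count matching commits per (law, year).

    `commits` is an iterable of (iso_date, message) pairs, where iso_date
    starts with a four-digit year, e.g. "2018-05-24".
    """
    counts = Counter()
    for iso_date, message in commits:
        year = int(iso_date[:4])
        for law in laws_mentioned(message):
            counts[(law, year)] += 1
    return counts


# Made-up sample data, for illustration only.
sample_commits = [
    ("2018-05-20", "Add GDPR consent banner"),
    ("2018-05-24", "gdpr: allow users to export their data"),
    ("2020-01-03", "Update privacy policy for CCPA"),
    ("2019-07-11", "Fix typo in README"),
]

counts = commits_per_year(sample_commits)
for (law, year), n in sorted(counts.items()):
    print(law, year, n)
```

Grouping by (law, year) is one simple way to check whether commit activity clusters around the year a law took effect. A real analysis would also need to handle false positives, since an acronym can appear in unrelated contexts.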
Taxonomy
Topics: Privacy, Security, and Data Protection · Ethics and Social Impacts of AI · Scientific Computing and Data Management
