Understanding and Benchmarking the Impact of GDPR on Database Systems
Supreeth Shastri, Vinay Banakar, Melissa Wasserman, Arun Kumar, Vijay Chidambaram

TL;DR
This paper analyzes GDPR's impact on database systems, introduces GDPRbench for benchmarking compliance, and evaluates the performance of modified systems, revealing significant scalability and performance challenges.
Contribution
It provides a systematic translation of GDPR requirements into database capabilities, introduces the GDPRbench benchmark, and assesses real-world system performance under GDPR constraints.
Findings
Metadata explosion significantly affects storage requirements.
GDPR-compliant systems perform poorly on GDPR workloads.
Performance degrades as personal data volume increases.
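To make the metadata-explosion finding concrete, here is a minimal sketch that compares the encoded size of one personal-data value with the size of the metadata stored alongside it. The record layout, field names, and values are illustrative assumptions, not the schema used by GDPRbench or by any of the evaluated systems.

```python
import json

# Hypothetical personal-data record. All field names and values are
# illustrative assumptions, not taken from the paper or from GDPRbench.
record = {
    "data": "555-0100",
    "metadata": {
        "purpose": ["billing", "support"],
        "ttl_seconds": 2592000,
        "owner": "user-1234",
        "objections": ["marketing"],
        "shared_with": ["processor-a"],
        "origin": "signup-form",
    },
}


def encoded_size(value) -> int:
    """Return the size in bytes of the UTF-8 JSON encoding of value."""
    return len(json.dumps(value).encode("utf-8"))


data_bytes = encoded_size(record["data"])
metadata_bytes = encoded_size(record["metadata"])
overhead_ratio = metadata_bytes / data_bytes

print(f"data: {data_bytes} B, metadata: {metadata_bytes} B, "
      f"ratio: {overhead_ratio:.1f}x")
```

In this toy record the metadata is several times larger than the value it describes. That is the general shape of the storage overhead the finding refers to, though the actual ratio depends on the system and the data.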
Abstract
The General Data Protection Regulation (GDPR) provides new rights and protections to European people concerning their personal data. We analyze GDPR from a systems perspective, translating its legal articles into a set of capabilities and characteristics that compliant systems must support. Our analysis reveals the phenomenon of metadata explosion, wherein large quantities of metadata need to be stored along with the personal data to satisfy the GDPR requirements. Our analysis also helps us identify new workloads that must be supported under GDPR. We design and implement an open-source benchmark called GDPRbench that consists of workloads and metrics needed to understand and assess personal-data processing database systems. To gauge the readiness of modern database systems for GDPR, we follow best practices and developer recommendations to modify Redis, PostgreSQL, and a commercial…
