What Should LLMs Forget? Quantifying Personal Data in LLMs for Right-to-Be-Forgotten Requests
Dimitri Staufer

TL;DR
This paper introduces a dataset and metric to identify and quantify personal data memorized by LLMs, facilitating compliance with GDPR's Right to Be Forgotten at the individual level.
Contribution
It presents WikiMem, a new dataset, and a model-agnostic metric for detecting personal data in LLMs, enabling targeted unlearning and privacy compliance.
Findings
Memorization correlates with web presence and model size.
The metric effectively ranks factual associations in LLMs.
Evaluation across multiple models demonstrates practical applicability.
Abstract
Large Language Models (LLMs) can memorize and reveal personal information, raising concerns regarding compliance with the EU's GDPR, particularly the Right to Be Forgotten (RTBF). Existing machine unlearning methods assume the data to forget is already known but do not address how to identify which individual-fact associations are stored in the model. Privacy auditing techniques typically operate at the population level or target a small set of identifiers, limiting applicability to individual-level data inquiries. We introduce WikiMem, a dataset of over 5,000 natural language canaries covering 243 human-related properties from Wikidata, and a model-agnostic metric to quantify human-fact associations in LLMs. Our approach ranks ground-truth values against counterfactuals using calibrated negative log-likelihood across paraphrased prompts. We evaluate 200 individuals across 15 LLMs…
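The ranking idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not spell out the calibration, so the code assumes one common choice, subtracting each candidate's NLL under a subject-free baseline prompt. The function names (`calibrated_nll`, `rank_ground_truth`) and all numbers are hypothetical toy values, not results from the paper.

```python
from statistics import mean


def calibrated_nll(nll_with_subject: float, nll_baseline: float) -> float:
    # ASSUMPTION: calibration = NLL given the subject minus NLL without it.
    # Lower values mean the subject makes the candidate more likely.
    return nll_with_subject - nll_baseline


def rank_ground_truth(scores, baselines, truth):
    """Rank the true value among counterfactuals (rank 1 = most likely).

    scores:    {candidate: [NLL under each paraphrased prompt]}
    baselines: {candidate: NLL under a subject-free prompt}
    """
    # Average the calibrated score over all paraphrases of the prompt.
    avg = {
        cand: mean(calibrated_nll(s, baselines[cand]) for s in per_prompt)
        for cand, per_prompt in scores.items()
    }
    ordered = sorted(avg, key=avg.get)  # ascending: lowest NLL first
    return ordered.index(truth) + 1, ordered


# Toy values for one hypothetical "place of birth" fact, two paraphrases.
scores = {"Berlin": [2.0, 2.4], "Paris": [3.0, 3.2], "Rome": [4.1, 3.9]}
baselines = {"Berlin": 5.0, "Paris": 4.0, "Rome": 5.0}

rank, ordered = rank_ground_truth(scores, baselines, truth="Berlin")
print(rank, ordered)
```

In this toy case the true value ranks first. A rank close to 1 across many facts about a person would suggest the model stores those associations, while a rank near the middle of the counterfactual pool would suggest it does not.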
Taxonomy
Topics: Digital Rights Management and Security · Artificial Intelligence in Law
Methods: Counterfactuals Explanations · Sparse Evolutionary Training
