Proactive Privacy Amnesia for Large Language Models: Safeguarding PII with Negligible Impact on Model Utility
Martin Kuo, Jingyang Zhang, Jianyi Zhang, Minxue Tang, Louis DiValentin, Aolin Ding, Jingwei Sun, William Chen, Amin Hass, Tianlong Chen, Yiran Chen, Hai Li

TL;DR
This paper introduces Proactive Privacy Amnesia, a novel method inspired by cognitive science, to protect PII in large language models by actively forgetting sensitive information with minimal impact on model utility.
Contribution
The paper proposes a new proactive forgetting mechanism for LLMs that effectively safeguards PII while preserving model performance, outperforming existing methods.
Findings
Complete elimination of phone number exposure
Reduces the risk of physical address exposure by 9.8% to 87.6%
Maintains comparable model utility
Abstract
With the rise of large language models (LLMs), increasing research has recognized their risk of leaking personally identifiable information (PII) under malicious attacks. Although efforts have been made to protect PII in LLMs, existing methods struggle to balance privacy protection with maintaining model utility. In this paper, inspired by studies of amnesia in cognitive science, we propose a novel approach, Proactive Privacy Amnesia (PPA), to safeguard PII in LLMs while preserving their utility. This mechanism works by actively identifying and forgetting key memories most closely associated with PII in sequences, followed by a memory implanting using suitable substitute memories to maintain the LLM's functionality. We conduct evaluations across multiple models to protect common PII, such as phone numbers and physical addresses, against prevalent PII-targeted attacks, demonstrating the…
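The identify-then-forget-then-implant flow described in the abstract can be illustrated with a small sketch. The abstract does not say how "key memories" are selected, so the relative-loss-drop heuristic below is purely an illustrative assumption, as are the names find_key_tokens and build_amnesia_plan, the per-token loss values, and the placeholder tokens. It shows only the bookkeeping: which token positions might be marked for forgetting and which substitute sequence might be implanted afterwards. No model is trained.

```python
from typing import Dict, List


def find_key_tokens(token_losses: List[float], top_k: int = 1) -> List[int]:
    """Pick the positions whose loss drops most relative to the previous token.

    ASSUMPTION: a sharp relative drop in per-token loss is used here as a
    stand-in signal for a "key memory". This is not taken from the paper.
    """
    drops = []
    for i in range(1, len(token_losses)):
        prev = token_losses[i - 1]
        if prev <= 0:
            continue  # skip positions where a relative drop is undefined
        drops.append(((prev - token_losses[i]) / prev, i))
    # Largest relative drop first; ties broken by earlier position.
    drops.sort(key=lambda pair: (-pair[0], pair[1]))
    return sorted(index for _, index in drops[:top_k])


def build_amnesia_plan(
    pii_tokens: List[str],
    token_losses: List[float],
    substitute_tokens: List[str],
    top_k: int = 1,
) -> Dict[str, list]:
    """Return a hypothetical plan: tokens to forget, then tokens to implant."""
    key_positions = find_key_tokens(token_losses, top_k)
    return {
        # Positions a forgetting step (e.g. raising their loss) would target.
        "forget": [(i, pii_tokens[i]) for i in key_positions],
        # Substitute sequence a later fine-tuning step would teach instead.
        "implant": list(substitute_tokens),
    }


# Made-up per-token losses for a five-token PII sequence (placeholder tokens).
losses = [4.0, 3.6, 0.4, 0.3, 0.2]
plan = build_amnesia_plan(
    pii_tokens=["t0", "t1", "t2", "t3", "t4"],
    token_losses=losses,
    substitute_tokens=["s0", "s1", "s2", "s3", "s4"],
)
print(plan["forget"])
```

In this toy input the loss falls from 3.6 to 0.4 at position 2, the largest relative drop, so that position is the one marked for forgetting. A real pipeline would obtain the losses from the model itself and follow the plan with actual parameter updates.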
Taxonomy
Topics: Privacy-Preserving Technologies in Data
