Proactive Privacy Amnesia for Large Language Models: Safeguarding PII with Negligible Impact on Model Utility

Martin Kuo; Jingyang Zhang; Jianyi Zhang; Minxue Tang; Louis DiValentin; Aolin Ding; Jingwei Sun; William Chen; Amin Hass; Tianlong Chen; Yiran Chen; Hai Li

arXiv:2502.17591·cs.CL·March 12, 2025

TL;DR

This paper introduces Proactive Privacy Amnesia, a novel method inspired by cognitive science, to protect PII in large language models by actively forgetting sensitive information with minimal impact on model utility.

Contribution

The paper proposes a new proactive forgetting mechanism for LLMs that effectively safeguards PII while preserving model performance, outperforming existing methods.

Findings

01. Complete elimination of phone number exposure

02. Significant reduction in address exposure (9.8% to 87.6%)

03. Maintains comparable model utility

Abstract

With the rise of large language models (LLMs), increasing research has recognized their risk of leaking personally identifiable information (PII) under malicious attacks. Although efforts have been made to protect PII in LLMs, existing methods struggle to balance privacy protection with maintaining model utility. In this paper, inspired by studies of amnesia in cognitive science, we propose a novel approach, Proactive Privacy Amnesia (PPA), to safeguard PII in LLMs while preserving their utility. This mechanism works by actively identifying and forgetting key memories most closely associated with PII in sequences, followed by a memory implanting using suitable substitute memories to maintain the LLM's functionality. We conduct evaluations across multiple models to protect common PII, such as phone numbers and physical addresses, against prevalent PII-targeted attacks, demonstrating the…
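The three stages named in the abstract (identify key memories, forget them, implant substitutes) can be pictured with a small data-preparation sketch. This is only an illustration under stated assumptions, not the authors' implementation: the per-token `scores`, the `AmnesiaPlan` container, and the functions `pick_key_tokens` and `build_plan` are all hypothetical names introduced here, and the sketch stops at listing what would be forgotten and implanted rather than performing any model updates.

```python
from dataclasses import dataclass


@dataclass
class AmnesiaPlan:
    """Hypothetical container for the two data sets a PPA-style pipeline would need."""
    forget_targets: list   # (prefix, token) pairs whose association should be weakened
    implant_targets: list  # (prefix, substitute) pairs to train on afterwards


def pick_key_tokens(tokens, scores, k=1):
    """Return the k tokens with the highest score, kept in original order.

    `scores` is an assumed per-token association measure; how it would be
    computed is not specified in the abstract.
    """
    top = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)[:k]
    return [tokens[i] for i in sorted(top)]


def build_plan(prefix, pii_tokens, scores, substitute, k=1):
    """Stage 1: select key tokens. Stages 2 and 3: list what to forget and implant."""
    keys = pick_key_tokens(pii_tokens, scores, k)
    return AmnesiaPlan(
        forget_targets=[(prefix, tok) for tok in keys],
        implant_targets=[(prefix, substitute)],
    )


# Toy data: a fictional phone number split into tokens, with made-up scores.
plan = build_plan(
    prefix="Alice's phone number is",
    pii_tokens=["555", "-", "0100"],
    scores=[0.9, 0.1, 0.4],
    substitute="555-0199",
)
print(plan.forget_targets)
print(plan.implant_targets)
```

In a real system the forget and implant targets would drive training updates on the model; the sketch only shows how selecting a few key tokens, rather than the whole PII string, keeps the forgetting step narrow.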

Videos

Proactive Privacy Amnesia for Large Language Models: Safeguarding PII with Negligible Impact on Model Utility (SlidesLive)

Taxonomy

Topics: Privacy-Preserving Technologies in Data