Towards Benchmarking Privacy Vulnerabilities in Selective Forgetting with Large Language Models
Wei Qian, Chenxu Zhao, Yangyi Li, Mengdi Huai

TL;DR
This paper introduces the first comprehensive benchmark to evaluate privacy vulnerabilities in selective forgetting for large language models, analyzing various attacks, methods, and architectures to standardize privacy assessment.
Contribution
The paper provides a systematic benchmark for privacy leakage in machine unlearning, enabling fair comparison and a better understanding of privacy risks in selective forgetting.
Findings
Identifies key factors influencing privacy leakage
Evaluates state-of-the-art unlearning privacy attacks
Provides insights for privacy-preserving unlearning methods
Abstract
The rapid advancements in artificial intelligence (AI) have primarily focused on the process of learning from data to acquire knowledgeable learning systems. As these systems are increasingly deployed in critical areas, ensuring their privacy and alignment with human values is paramount. Recently, selective forgetting (also known as machine unlearning) has shown promise for privacy and data removal tasks, and has emerged as a transformative paradigm shift in the field of AI. It refers to the ability of a model to selectively erase the influence of previously seen data, which is especially important for compliance with modern data protection regulations and for aligning models with human values. Despite its promise, selective forgetting raises significant privacy concerns, especially when the data involved come from sensitive domains. While new unlearning-induced privacy attacks are…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Privacy-Preserving Technologies in Data · Domain Adaptation and Few-Shot Learning
