Exploring the limits of strong membership inference attacks on large language models
Jamie Hayes, Ilia Shumailov, Christopher A. Choquette-Choo, Matthew Jagielski, George Kaissis, Milad Nasr, Sahra Ghalebikesabi, Meenatchi Sundaram Mutu Selva Annamalai, Niloofar Mireshghallah, Igor Shilov, Matthieu Meeus, Yves-Alexandre de Montjoye, Katherine Lee

TL;DR
This paper scales a strong membership inference attack (MIA) to large language models and finds that such attacks achieve only limited success and produce unstable decisions, challenging prior assumptions about LLM privacy vulnerabilities.
Contribution
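The paper scales LiRA, one of the strongest MIAs, to GPT-2 models ranging from 10M to 1B parameters. It shows that the attack's practical effectiveness is limited, that many of its decisions are unstable, and that its relationship to other privacy metrics is complex.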
Findings
Strong MIAs can succeed on large LLMs but with limited effectiveness.
Many MIA decisions are unstable and indistinguishable from random chance.
The relationship between MIA success and privacy metrics is complex.
Abstract
State-of-the-art membership inference attacks (MIAs) typically require training many reference models, which makes them difficult to scale to large pre-trained language models (LLMs). As a result, prior research has relied either on weaker attacks that avoid training reference models (e.g., fine-tuning attacks) or on stronger attacks applied to small models and datasets. However, weaker attacks have been shown to be brittle, and insights from strong attacks in simplified settings do not translate to today's LLMs. These challenges prompt an important question: are the limitations observed in prior work due to attack design choices, or are MIAs fundamentally ineffective on LLMs? We address this question by scaling LiRA, one of the strongest MIAs, to GPT-2 architectures ranging from 10M to 1B parameters, training reference models on over 20B tokens from the C4 dataset. Our results advance the…
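For readers unfamiliar with reference-model attacks, the following is a minimal sketch of the general LiRA idea, not the authors' implementation. It assumes a per-example statistic (here, something like the negative loss) has already been computed under the target model and under reference models trained with and without the example. The function names, the Gaussian fit, and the pairwise AUC helper are illustrative choices, not details taken from the paper.

```python
import numpy as np


def gaussian_logpdf(x, mean, std):
    """Log-density of a normal distribution, written out to avoid extra dependencies."""
    return -0.5 * np.log(2.0 * np.pi * std**2) - (x - mean) ** 2 / (2.0 * std**2)


def lira_score(target_stat, in_stats, out_stats, eps=1e-6):
    """Log-likelihood ratio that an example was in the target model's training set.

    target_stat: statistic of the example under the target model (assumed: negative loss).
    in_stats:    same statistic under reference models trained WITH the example.
    out_stats:   same statistic under reference models trained WITHOUT the example.
    eps:         small constant that keeps the fitted standard deviation positive.
    """
    in_stats = np.asarray(in_stats, dtype=float)
    out_stats = np.asarray(out_stats, dtype=float)
    log_p_in = gaussian_logpdf(target_stat, in_stats.mean(), in_stats.std() + eps)
    log_p_out = gaussian_logpdf(target_stat, out_stats.mean(), out_stats.std() + eps)
    # Positive values favor "member", negative values favor "non-member".
    return log_p_in - log_p_out


def roc_auc(member_scores, nonmember_scores):
    """AUC as the probability that a random member outscores a random non-member."""
    m = np.asarray(member_scores, dtype=float)[:, None]
    n = np.asarray(nonmember_scores, dtype=float)[None, :]
    # Ties count as half a win, matching the usual rank-based definition of AUC.
    return float(np.mean((m > n) + 0.5 * (m == n)))
```

In this sketch, an AUC near 0.5 from `roc_auc` would correspond to an attack that is no better than random guessing. The cost the abstract refers to comes from producing `in_stats` and `out_stats`, since each entry requires a separately trained reference model.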
Taxonomy
Topics: Topic Modeling
Methods: Attention Is All You Need · Cosine Annealing · Linear Layer · Dense Connections · Linear Warmup With Cosine Annealing · Attention Dropout · Softmax · Weight Decay · Multi-Head Attention
