A Statistical and Multi-Perspective Revisiting of the Membership Inference Attack in Large Language Models
Bowen Chen, Namgi Han, Yusuke Miyao

TL;DR
This study conducts a comprehensive statistical analysis of Membership Inference Attacks on Large Language Models across various settings, revealing factors affecting attack performance and highlighting challenges like threshold selection and text dissimilarity.
Contribution
It provides the first large-scale, multi-setting statistical evaluation of MIA methods, uncovering key factors influencing their effectiveness and variability.
Findings
MIA performance increases with model size and varies across domains.
Most MIA methods do not statistically outperform baselines.
Threshold decision is a critical but overlooked challenge.
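To make the threshold finding concrete, here is a minimal sketch of a score-based membership decision. It is not the paper's method or data: the scores are synthetic Gaussian draws standing in for an MIA score such as negative per-token loss, and the names `auc`, `accuracy_at`, `best_threshold`, and the two "domains" are hypothetical. It illustrates one plausible reason threshold choice is hard: a ranking metric like AUC needs no threshold, while accuracy depends on a cutoff that may not transfer when the score scale shifts between domains.

```python
import random


def auc(member_scores, nonmember_scores):
    """Probability that a random member outscores a random non-member (ties count half)."""
    wins = 0.0
    for m in member_scores:
        for n in nonmember_scores:
            if m > n:
                wins += 1.0
            elif m == n:
                wins += 0.5
    return wins / (len(member_scores) * len(nonmember_scores))


def accuracy_at(threshold, member_scores, nonmember_scores):
    """Accuracy when every score >= threshold is predicted to be a member."""
    true_pos = sum(s >= threshold for s in member_scores)
    true_neg = sum(s < threshold for s in nonmember_scores)
    return (true_pos + true_neg) / (len(member_scores) + len(nonmember_scores))


def best_threshold(member_scores, nonmember_scores):
    """Pick the observed score that maximizes accuracy on the given labeled sample."""
    candidates = sorted(set(member_scores) | set(nonmember_scores))
    return max(candidates, key=lambda t: accuracy_at(t, member_scores, nonmember_scores))


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Synthetic scores (assumption): members score slightly higher than non-members.
domain_a_members = [rng.gauss(-2.0, 0.5) for _ in range(500)]
domain_a_nonmembers = [rng.gauss(-2.3, 0.5) for _ in range(500)]

# Second synthetic domain: same separation, but the whole score scale is shifted.
domain_b_members = [rng.gauss(-3.0, 0.5) for _ in range(500)]
domain_b_nonmembers = [rng.gauss(-3.3, 0.5) for _ in range(500)]

auc_a = auc(domain_a_members, domain_a_nonmembers)
auc_b = auc(domain_b_members, domain_b_nonmembers)

# Tune the threshold on domain A, then reuse it unchanged on domain B.
t_a = best_threshold(domain_a_members, domain_a_nonmembers)
acc_a = accuracy_at(t_a, domain_a_members, domain_a_nonmembers)
acc_b_transfer = accuracy_at(t_a, domain_b_members, domain_b_nonmembers)

print(f"AUC: domain A = {auc_a:.3f}, domain B = {auc_b:.3f}")
print(f"Accuracy with domain-A threshold: A = {acc_a:.3f}, B = {acc_b_transfer:.3f}")
```

In this toy setup the two domains are about equally separable by ranking, yet the cutoff tuned on one domain performs worse on the other, because the decision depends on the absolute score scale rather than on the ordering alone.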
Abstract
The lack of data transparency in Large Language Models (LLMs) has highlighted the importance of the Membership Inference Attack (MIA), which distinguishes data a model was trained on (members) from data it was not trained on (non-members). Although MIA succeeded in earlier studies, recent research has reported near-random performance in other settings, revealing a significant performance inconsistency. We hypothesize that a single setting cannot represent the distribution of such vast corpora, so members and non-members are sampled from different distributions, which causes the inconsistency. In this study, instead of relying on a single setting, we statistically revisit MIA methods across varied settings, running thousands of experiments for each method, and additionally study the text features, embeddings, threshold decisions, and decoding dynamics of members and non-members. We found that (1) MIA performance improves with model size and…
Taxonomy
Topics: Topic Modeling
