In-Context Probing for Membership Inference in Fine-Tuned Language Models
Zhexi Lu, Hongliang Chi, Nathalie Baracaldo, Swanand Ravindra Kadhe, Yuseok Jeon, Lei Yu

TL;DR
This paper introduces ICP-MIA, a black-box membership inference attack that combines training dynamics with in-context probing to identify whether a sample was used to fine-tune a large language model, with the aim of strengthening privacy auditing.
Contribution
The paper presents ICP-MIA, a training-free, theoretically grounded framework that improves membership inference attacks on LLMs by estimating the optimization gap through in-context probing strategies.
Findings
ICP-MIA outperforms prior black-box MIAs at low false positive rates.
The effectiveness of ICP-MIA depends on reference data alignment and model configurations.
The approach provides a practical tool for privacy auditing of deployed LLMs.
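The first finding refers to performance at low false positive rates. As a rough illustration of that kind of metric (a generic sketch, not the paper's evaluation code), the function below computes the best true positive rate achievable while keeping the false positive rate at or below a chosen budget. The name `tpr_at_fpr` and the convention that a higher score means "more likely a member" are assumptions made for this example.

```python
from typing import Sequence


def tpr_at_fpr(
    member_scores: Sequence[float],
    nonmember_scores: Sequence[float],
    max_fpr: float,
) -> float:
    """Best TPR over all thresholds whose FPR does not exceed max_fpr.

    Assumed convention (not from the paper): a sample is predicted to be
    a member when its score is >= the threshold.
    """
    best_tpr = 0.0
    # Every observed score is a candidate threshold.
    for threshold in sorted(set(member_scores) | set(nonmember_scores)):
        false_pos = sum(s >= threshold for s in nonmember_scores)
        fpr = false_pos / len(nonmember_scores)
        if fpr <= max_fpr:
            true_pos = sum(s >= threshold for s in member_scores)
            best_tpr = max(best_tpr, true_pos / len(member_scores))
    return best_tpr
```

If an attack assigns lower values to members (as a gap-style signal might), the scores would need to be negated before calling this function.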
Abstract
Membership inference attacks (MIAs) pose a critical privacy threat to fine-tuned large language models (LLMs), especially when models are adapted to domain-specific tasks using sensitive data. While prior black-box MIA techniques rely on confidence scores or token likelihoods, these signals are often entangled with a sample's intrinsic properties - such as content difficulty or rarity - leading to poor generalization and low signal-to-noise ratios. In this paper, we propose ICP-MIA, a novel MIA framework grounded in the theory of training dynamics, particularly the phenomenon of diminishing returns during optimization. We introduce the Optimization Gap as a fundamental signal of membership: at convergence, member samples exhibit minimal remaining loss-reduction potential, while non-members retain significant potential for further optimization. To estimate this gap in a black-box…
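The abstract is cut off before it explains how the gap is estimated, so the following is only a speculative sketch of one way the idea could look in code, not the authors' procedure. It assumes a hypothetical black-box callable `loss_fn(context, target)` that returns the model's average loss on `target` when `context` is prepended, and it treats the loss reduction obtained from in-context demonstrations as a stand-in for the remaining loss-reduction potential. The names `optimization_gap_score` and `predict_member`, and the thresholding rule, are all illustrative assumptions.

```python
from typing import Callable, Sequence

# Hypothetical black-box interface (an assumption, not from the paper):
# returns the model's average loss on `target` given a prepended `context`.
LossFn = Callable[[str, str], float]


def optimization_gap_score(
    loss_fn: LossFn, target: str, demos: Sequence[str]
) -> float:
    """Loss reduction on `target` when demonstrations are placed in context.

    A small value suggests little remaining room for improvement, which the
    abstract associates with member samples.
    """
    base_loss = loss_fn("", target)
    context = "\n\n".join(demos) + "\n\n" if demos else ""
    probed_loss = loss_fn(context, target)
    return base_loss - probed_loss


def predict_member(gap: float, threshold: float) -> bool:
    """Illustrative decision rule: a small gap is treated as 'member'."""
    return gap < threshold
```

In practice the threshold would have to be calibrated on data with known membership, and the choice of demonstrations is likely to matter, as the findings on reference data alignment suggest.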
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Topic Modeling · Adversarial Robustness in Machine Learning
