Context-Aware Membership Inference Attacks against Pre-trained Large Language Models
Hongyan Chang, Ali Shahin Shamsabadi, Kleomenis Katevas, Hamed Haddadi, Reza Shokri

TL;DR
This paper introduces a membership inference attack tailored to pre-trained large language models. It uses the perplexity dynamics of subsequences within a data point to infer training-set membership, and it outperforms previous methods.
Contribution
The paper proposes a new MIA method for LLMs that accounts for their generative nature and context-dependent memorization, improving attack accuracy.
Findings
Outperforms prior MIAs on LLMs
Reveals context-dependent memorization patterns
Effective in identifying training data membership
Abstract
Membership Inference Attacks (MIAs) on pre-trained Large Language Models (LLMs) aim to determine whether a data point was part of the model's training set. Prior MIAs built for classification models fail on LLMs because they ignore the generative nature of LLMs across token sequences. In this paper, we present a novel attack on pre-trained LLMs that adapts MIA statistical tests to the perplexity dynamics of subsequences within a data point. Our method significantly outperforms prior approaches, revealing context-dependent memorization patterns in pre-trained LLMs.
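To make "perplexity dynamics of subsequences" concrete, here is a minimal Python sketch. It is not the paper's attack: the abstract does not specify the statistical tests used, so the slope feature below is only an illustrative assumption. The function names `prefix_perplexities` and `perplexity_slope` are hypothetical, and the per-token log-probabilities are assumed to have been obtained from the target model beforehand.

```python
import math
from typing import List


def prefix_perplexities(token_logprobs: List[float]) -> List[float]:
    """Perplexity of every prefix x_1..x_t of a sequence.

    token_logprobs[t-1] is assumed to be the natural-log probability the
    model assigns to token t given the preceding tokens.
    """
    perplexities = []
    running_sum = 0.0
    for t, logprob in enumerate(token_logprobs, start=1):
        running_sum += logprob
        # Perplexity = exp of the mean negative log-likelihood of the prefix.
        perplexities.append(math.exp(-running_sum / t))
    return perplexities


def perplexity_slope(perplexities: List[float]) -> float:
    """Least-squares slope of perplexity against prefix index.

    Hypothetical feature (an assumption, not taken from the paper): a
    summary of how perplexity changes as more context becomes available.
    """
    n = len(perplexities)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(perplexities) / n
    cov = sum((i - mean_x) * (p - mean_y) for i, p in enumerate(perplexities))
    var = sum((i - mean_x) ** 2 for i in range(n))
    return cov / var


# Made-up log-probabilities for a four-token sequence.
example_logprobs = [-2.0, -1.0, -0.5, -0.5]
example_ppls = prefix_perplexities(example_logprobs)
example_slope = perplexity_slope(example_ppls)
```

In this sketch, a negative slope means the model grows more confident as it sees more of the sequence. Whether and how such a trajectory separates members from non-members is exactly what the paper's statistical tests address, and this example makes no claim about that.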
Taxonomy
Topics: Machine Learning in Healthcare · Topic Modeling · Artificial Intelligence in Healthcare and Education
