Context-Aware Membership Inference Attacks against Pre-trained Large Language Models

Hongyan Chang; Ali Shahin Shamsabadi; Kleomenis Katevas; Hamed Haddadi; Reza Shokri

arXiv:2409.13745·cs.CL·September 17, 2025

TL;DR

This paper introduces a membership inference attack tailored to pre-trained large language models. The attack uses the perplexity dynamics of subsequences within a data point to identify training-set membership, and it outperforms prior methods.

Contribution

The paper proposes a new membership inference attack (MIA) for LLMs that accounts for their generative nature and context-dependent memorization, improving on the accuracy of prior attacks.

Findings

1. Outperforms prior MIAs on LLMs

2. Reveals context-dependent memorization patterns

3. Effective in identifying training data membership

Abstract

Membership Inference Attacks (MIAs) on pre-trained Large Language Models (LLMs) aim to determine whether a data point was part of the model's training set. Prior MIAs built for classification models fail on LLMs because they ignore the generative nature of LLMs across token sequences. In this paper, we present a novel attack on pre-trained LLMs that adapts MIA statistical tests to the perplexity dynamics of subsequences within a data point. Our method significantly outperforms prior approaches, revealing context-dependent memorization patterns in pre-trained LLMs.
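
As a rough illustration of what "perplexity dynamics of subsequences" could look like in practice, the sketch below computes the perplexity of every prefix of a data point and then summarizes how that perplexity changes along the sequence. It is not the paper's attack: the function names, the assumption that per-token natural-log probabilities are already available from the target model, and the choice of a least-squares slope as the summary statistic are all hypothetical choices made for illustration.

```python
import math
from typing import List, Sequence


def prefix_perplexities(token_logprobs: Sequence[float]) -> List[float]:
    """Perplexity of each prefix x_1..x_t of one data point.

    token_logprobs[t-1] is assumed to be the natural-log probability the
    target model assigns to token t given the preceding tokens.
    """
    perplexities = []
    running_sum = 0.0
    for t, logprob in enumerate(token_logprobs, start=1):
        running_sum += logprob
        # Perplexity = exp of the mean negative log-likelihood so far.
        perplexities.append(math.exp(-running_sum / t))
    return perplexities


def perplexity_slope(perplexities: Sequence[float]) -> float:
    """Least-squares slope of perplexity against token position.

    A hypothetical summary feature: a negative value means perplexity
    falls as more context is seen.
    """
    n = len(perplexities)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(perplexities) / n
    cov = sum((i - mean_x) * (p - mean_y) for i, p in enumerate(perplexities))
    var = sum((i - mean_x) ** 2 for i in range(n))
    return cov / var


# Illustrative input only: made-up log-probabilities for a 2-token sequence.
example_logprobs = [math.log(0.25), math.log(1.0)]
example_ppl = prefix_perplexities(example_logprobs)
example_slope = perplexity_slope(example_ppl)
```

A feature like `example_slope` would still need to be turned into a membership decision, for example by comparing it against a threshold calibrated on known non-member data; how the paper performs that step is not described in the abstract.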

Taxonomy

Topics: Machine Learning in Healthcare · Topic Modeling · Artificial Intelligence in Healthcare and Education