Window-based Membership Inference Attacks Against Fine-tuned Large Language Models

Yuetian Chen; Yuntao Du; Kaiyuan Zhang; Ashish Kundu; Charles Fleming; Bruno Ribeiro; Ninghui Li

arXiv:2601.02751 · cs.CL · March 9, 2026

TL;DR

This paper introduces WBC (Window-Based Comparison), a membership inference attack on fine-tuned large language models that aggregates votes from localized sliding-window loss comparisons instead of relying on a single global average. It achieves higher AUC than baselines and improves detection rates 2-3 times at low false positive levels.

Contribution

The paper proposes a novel window-based comparison method that captures localized memorization signals, outperforming existing global-averaging approaches in membership inference attacks on LLMs.

Findings

1. WBC achieves higher AUC scores than baselines.
2. Detection rates improve 2-3 times at low false positive levels.
3. Localized signals are more effective for membership inference.

Abstract

Most membership inference attacks (MIAs) against Large Language Models (LLMs) rely on global signals, like average loss, to identify training data. This approach, however, dilutes the subtle, localized signals of memorization, reducing attack effectiveness. We challenge this global-averaging paradigm, positing that membership signals are more pronounced within localized contexts. We introduce WBC (Window-Based Comparison), which exploits this insight through a sliding window approach with sign-based aggregation. Our method slides windows of varying sizes across text sequences, with each window casting a binary vote on membership based on loss comparisons between target and reference models. By ensembling votes across geometrically spaced window sizes, we capture memorization patterns from token-level artifacts to phrase-level structures. Extensive experiments across eleven datasets…
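
The voting scheme described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation; it only follows the abstract's outline under several assumptions that the abstract does not confirm: per-token losses from the target and reference models are already available as lists, windows slide with stride 1, window sizes double at each step, a window votes "member" when the target model's summed loss is strictly lower than the reference model's, and all window sizes are weighted equally. The function names (`geometric_window_sizes`, `window_vote_fraction`, `wbc_score`) are hypothetical.

```python
from itertools import accumulate


def geometric_window_sizes(min_size, max_size, ratio=2):
    """Window sizes min_size, min_size*ratio, ... up to max_size (assumed spacing)."""
    if min_size < 1 or ratio < 2:
        raise ValueError("min_size must be >= 1 and ratio must be >= 2")
    sizes = []
    w = min_size
    while w <= max_size:
        sizes.append(w)
        w *= ratio
    return sizes


def window_vote_fraction(target_loss, ref_loss, window):
    """Fraction of stride-1 windows in which the target model's summed loss
    is strictly lower than the reference model's (a binary, sign-based vote)."""
    if len(target_loss) != len(ref_loss):
        raise ValueError("loss sequences must have the same length")
    if not 1 <= window <= len(target_loss):
        raise ValueError("window must be between 1 and the sequence length")
    # Positive difference means the target model fits this token better.
    diffs = [r - t for t, r in zip(target_loss, ref_loss)]
    # Prefix sums give every window sum in O(1).
    prefix = [0.0] + list(accumulate(diffs))
    sums = [prefix[i + window] - prefix[i] for i in range(len(diffs) - window + 1)]
    votes = [1 if s > 0 else 0 for s in sums]
    return sum(votes) / len(votes)


def wbc_score(target_loss, ref_loss, min_size=1, max_size=8):
    """Average the vote fraction over geometrically spaced window sizes
    (equal weighting is an assumption). Higher means more member-like."""
    n = len(target_loss)
    sizes = [w for w in geometric_window_sizes(min_size, max_size) if w <= n]
    if not sizes:
        raise ValueError("sequence is shorter than the smallest window")
    fractions = [window_vote_fraction(target_loss, ref_loss, w) for w in sizes]
    return sum(fractions) / len(fractions)


if __name__ == "__main__":
    # Toy losses: one outlier token dominates the global average,
    # while most individual tokens still favor the target model.
    target = [0.5, 3.0, 0.5, 0.5]
    reference = [1.0, 1.0, 1.0, 1.0]
    print(wbc_score(target, reference, min_size=1, max_size=4))
```

In the toy example, the mean loss difference is negative, so a global-average comparison would lean toward "non-member", yet three of the four single-token windows vote "member". Because each vote records only the sign of the comparison, a single high-loss token cannot dominate the score the way it dominates an average.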

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Topic Modeling · Explainable Artificial Intelligence (XAI)