Automatic Calibration for Membership Inference Attack on Large Language Models

Saleh Zare Zade; Yao Qiang; Xiangyu Zhou; Hui Zhu; Mohammad Amin Roshani; Prashant Khanduri; Dongxiao Zhu

arXiv:2505.03392 · cs.LG · May 7, 2025

TL;DR

This paper presents ACMIA, a novel, calibration-based framework for membership inference attacks on large language models that improves accuracy and robustness without relying on reference models.

Contribution

We propose ACMIA, a tunable, theoretically motivated calibration method for MIAs on LLMs, enhancing reliability and generalizability over existing approaches.

Findings

01. Outperforms state-of-the-art baselines across multiple benchmarks.
02. Remains effective under different levels of access to the target LLM.
03. Demonstrates robustness and generalizability in extensive experiments.

Abstract

Membership Inference Attacks (MIAs) have recently been employed to determine whether a specific text was part of the pre-training data of Large Language Models (LLMs). However, existing methods often misinfer non-members as members, leading to a high false positive rate, or depend on additional reference models for probability calibration, which limits their practicality. To overcome these challenges, we introduce a novel framework called Automatic Calibration Membership Inference Attack (ACMIA), which utilizes a tunable temperature to calibrate output probabilities effectively. This approach is inspired by our theoretical insights into maximum likelihood estimation during the pre-training of LLMs. We introduce ACMIA in three configurations designed to accommodate different levels of model access and increase the probability gap between members and non-members, improving the reliability…
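
The idea of calibrating with a tunable temperature can be illustrated with a small sketch. The abstract does not give ACMIA's exact scoring rule, so the code below is an assumption-laden illustration and not the authors' method: it treats the temperature-scaled distribution as a self-reference and scores a text by the gap between its mean token log-probability at temperature 1 and at temperature `temperature`. The function names (`log_softmax`, `mean_token_log_prob`, `calibrated_score`) and the toy logits are hypothetical.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Log-probabilities of softmax(logits / temperature), computed stably."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max before exponentiating to avoid overflow
    log_norm = m + math.log(sum(math.exp(z - m) for z in scaled))
    return [z - log_norm for z in scaled]


def mean_token_log_prob(logits_per_position, token_ids, temperature=1.0):
    """Average log-probability of the observed tokens at a given temperature."""
    if not token_ids or len(logits_per_position) != len(token_ids):
        raise ValueError("need one logit vector per observed token")
    total = 0.0
    for logits, tok in zip(logits_per_position, token_ids):
        total += log_softmax(logits, temperature)[tok]
    return total / len(token_ids)


def calibrated_score(logits_per_position, token_ids, temperature):
    """Illustrative score (an assumption, not the paper's definition):
    raw mean log-probability minus the temperature-scaled one.
    No separate reference model is needed."""
    raw = mean_token_log_prob(logits_per_position, token_ids, 1.0)
    ref = mean_token_log_prob(logits_per_position, token_ids, temperature)
    return raw - ref


# Toy data: a 3-token vocabulary and a 2-token text (token ids 0 and 1).
confident_logits = [[5.0, 0.0, 0.0], [0.0, 4.0, 0.0]]  # observed tokens strongly predicted
flat_logits = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]       # model is indifferent
tokens = [0, 1]

score_confident = calibrated_score(confident_logits, tokens, temperature=2.0)
score_flat = calibrated_score(flat_logits, tokens, temperature=2.0)
```

In this toy setup a text whose tokens the model predicts confidently receives a larger score than one the model is indifferent to, and an attacker would compare the score against a threshold chosen on held-out data. How the temperature is tuned and how the three ACMIA configurations differ is specified in the paper, not here.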

Code & Models

Repositories

salehzz/acmia (PyTorch, official)

Taxonomy

Topics: Topic Modeling