Sampling-based Pseudo-Likelihood for Membership Inference Attacks
Masahiro Kaneko, Youmi Ma, Yuki Wata, Naoaki Okazaki

TL;DR
This paper introduces SaMIA, a membership inference attack for large language models that computes a sampling-based pseudo-likelihood from generated text samples, so it remains applicable when the model's likelihoods are inaccessible.
Contribution
The paper proposes SaMIA, a novel likelihood-free MIA method that uses sampling and n-gram matching to detect training data membership in LLMs.
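The sampling and n-gram matching idea can be illustrated with a minimal sketch. This is not the authors' implementation. The half-way prefix split, the recall-style unigram overlap, the sample count, and the 0.5 threshold are all assumptions made for illustration, and `generate` is a hypothetical callable standing in for any text-only LLM interface.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_recall(reference, candidate, n=1):
    """Fraction of the reference's n-grams that also appear in the candidate."""
    ref = ngrams(reference.split(), n)
    cand = ngrams(candidate.split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Clip each n-gram's credit to the number of times the candidate contains it.
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


def pseudo_likelihood(reference, samples, n=1):
    """Average n-gram overlap between the reference text and sampled outputs."""
    if not samples:
        return 0.0
    return sum(ngram_recall(reference, s, n) for s in samples) / len(samples)


def is_member(target_text, generate, num_samples=10, n=1, threshold=0.5):
    """Guess membership from generated text alone (no likelihoods needed).

    Assumption: the first half of the target is used as a prompt and the
    second half as the reference that the samples are compared against.
    `generate` is a hypothetical function mapping a prompt to one sampled
    continuation.
    """
    words = target_text.split()
    half = len(words) // 2
    prefix = " ".join(words[:half])
    reference = " ".join(words[half:])
    samples = [generate(prefix) for _ in range(num_samples)]
    return pseudo_likelihood(reference, samples, n) >= threshold
```

In this sketch, a model that reproduces the held-out half of a text scores close to 1 and is flagged as having seen it, while unrelated continuations score close to 0. In practice the threshold would need to be tuned on texts whose membership status is known.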
Findings
SaMIA performs comparably to likelihood-based methods.
SaMIA works on proprietary models without access to likelihoods.
The method effectively detects training data leaks.
Abstract
Large Language Models (LLMs) are trained on large-scale web data, which makes it difficult to grasp the contribution of each text. This poses the risk of leaking inappropriate data such as benchmarks, personal information, and copyrighted texts in the training data. Membership Inference Attacks (MIA), which determine whether a given text is included in the model's training data, have been attracting attention. Previous studies of MIAs revealed that likelihood-based classification is effective for detecting leaks in LLMs. However, the existing methods cannot be applied to some proprietary models like ChatGPT or Claude 3 because the likelihood is unavailable to the user. In this study, we propose a Sampling-based Pseudo-Likelihood (SPL) method for MIA (SaMIA) that calculates SPL using only the text generated by an LLM to detect leaks. The SaMIA treats the target text as…
Taxonomy
Topics: Network Security and Intrusion Detection · Information and Cyber Security · Spam and Phishing Detection
Methods: Semi-Pseudo-Label
