Nob-MIAs: Non-biased Membership Inference Attacks Assessment on Large Language Models with Ex-Post Dataset Construction

Cédric Eichler; Nathan Champeil; Nicolas Anciaux; Alexandra Bensamoun; Heber Hwang Arcolezi; José Maria De Fuentes

arXiv:2408.05968·cs.CR·January 17, 2025

TL;DR

This paper introduces methods for building unbiased datasets to evaluate Membership Inference Attacks (MIAs) on Large Language Models. It shows that removing bias reduces measured attack effectiveness, which underlines how strongly dataset bias can inflate MIA results.

Contribution

It proposes algorithms for constructing non-biased datasets for fairer MIA evaluation on LLMs, addressing distributional biases in ex-post assessments.

Findings

01

Bias removal diminishes MIA effectiveness

02

Non-biased datasets yield AUC-ROC scores similar to random datasets

03

Most MIAs perform near random when biases are neutralized
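The findings above are stated in terms of AUC-ROC. As a reminder of what that number measures, here is a minimal sketch, not taken from the paper or its repository: it computes AUC-ROC as the probability that a randomly chosen member receives a higher attack score than a randomly chosen non-member. The function name `auc_roc` and the example scores are hypothetical, and the sketch assumes that a higher score means "more likely a member".

```python
def auc_roc(member_scores, non_member_scores):
    """AUC-ROC as the pairwise win rate of members over non-members.

    Assumption: a higher score means the attack considers the
    document more likely to be a training-set member.
    Ties count as half a win.
    """
    if not member_scores or not non_member_scores:
        raise ValueError("both score lists must be non-empty")

    wins = 0.0
    for m in member_scores:
        for n in non_member_scores:
            if m > n:
                wins += 1.0
            elif m == n:
                wins += 0.5
    return wins / (len(member_scores) * len(non_member_scores))


# Hypothetical attack scores, for illustration only.
separable = auc_roc([0.9, 0.8], [0.1, 0.2])      # attack separates the two sets
uninformative = auc_roc([0.1, 0.9], [0.5, 0.5])  # attack is no better than chance
```

A value near 0.5 means the attack cannot tell members from non-members, which is what "near random" refers to in the findings.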

Abstract

The rise of Large Language Models (LLMs) has triggered legal and ethical concerns, especially regarding the unauthorized use of copyrighted materials in their training datasets. This has led to lawsuits against tech companies accused of using protected content without permission. Membership Inference Attacks (MIAs) aim to detect whether specific documents were used in a given LLM's pretraining, but their effectiveness is undermined by biases such as time-shifts and n-gram overlaps. This paper addresses the evaluation of MIAs on LLMs with partially inferable training sets, under the ex-post hypothesis, which acknowledges inherent distributional biases between member and non-member datasets. We propose and validate algorithms to create "non-biased" and "non-classifiable" datasets for fairer MIA assessment. Experiments using the Gutenberg dataset on OpenLLaMA and Pythia show that…
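The abstract names n-gram overlap as one source of bias between member and non-member sets. The sketch below illustrates one simple way such an overlap could be measured. It is an assumption made for illustration and not the paper's algorithm: the helper names `ngrams` and `ngram_overlap`, the whitespace tokenization, and the default `n=3` are all hypothetical choices.

```python
def ngrams(text, n):
    """Return the set of word n-grams in `text` (lowercased, whitespace-split)."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def ngram_overlap(candidate, reference, n=3):
    """Fraction of the candidate's distinct n-grams that also occur in the reference.

    Returns 0.0 when the candidate is too short to contain any n-gram.
    """
    candidate_ngrams = ngrams(candidate, n)
    if not candidate_ngrams:
        return 0.0
    shared = candidate_ngrams & ngrams(reference, n)
    return len(shared) / len(candidate_ngrams)


# Toy strings, for illustration only.
overlap = ngram_overlap("the cat sat on the mat", "the cat sat on a rug")
```

If non-members systematically overlap less with known training text than members do, an attack can exploit that gap without detecting membership at all, which is the kind of bias the abstract describes.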

Code & Models

Repositories

ceichler/MIA-bias-removal
PyTorch · Official
