Lost in the Averages: A New Specific Setup to Evaluate Membership Inference Attacks Against Machine Learning Models

Nataša Krčo; Florent Guépin; Matthieu Meeus; Bogdan Kulynych; Yves-Alexandre de Montjoye

arXiv:2405.15423·cs.LG·October 17, 2025

TL;DR

This paper introduces the model-seeded game, a new evaluation setup for membership inference attacks that gives more accurate privacy risk estimates for a specific released model or dataset than the traditional game does.

Contribution

It formalizes the model-seeded game, demonstrating its effectiveness in accurately assessing privacy risks for individual records in specific datasets and models.

Findings

01

The traditional privacy game averages a record's risk across datasets, which can give a misleading estimate.

02

The model-seeded game provides dataset-specific privacy risk estimates.

03

Up to 94% of high-risk records are overlooked by traditional methods.

Abstract

Synthetic data generators and machine learning models can memorize their training data, posing privacy concerns. Membership inference attacks (MIAs) are a standard method of estimating the privacy risk of these systems. The risk of individual records is typically computed by evaluating MIAs in a record-specific privacy game. We analyze the record-specific privacy game commonly used for evaluating attackers under realistic assumptions (the traditional game), particularly for synthetic tabular data, and show that it averages a record's privacy risk across datasets. We show this implicitly assumes the dataset a record is part of has no impact on the record's risk, providing a misleading risk estimate when a specific model or synthetic dataset is released. Instead, we propose a novel use of the leave-one-out game, used in existing work exclusively to audit differential privacy…
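The difference between the two evaluation setups can be illustrated with a toy sketch. This is not the paper's implementation: the "model" (a noisy dataset mean), the attacker score, and every function name below are hypothetical stand-ins chosen only to show where the randomness comes from in each game. In the traditional game a fresh dataset is sampled on every trial, so the result averages over datasets. In the fixed-dataset (leave-one-out style) game, the rest of the dataset is held constant and only the target's membership and the training randomness vary.

```python
import random
import statistics

def train(dataset, rng, noise=0.05):
    # Toy "model": dataset mean plus Gaussian noise standing in for training randomness.
    return statistics.fmean(dataset) + rng.gauss(0.0, noise)

def attack_score(output, target):
    # Toy attacker (an assumption): knows the population mean is 0 and scores
    # how far the output has moved in the direction of the target record.
    return output * (1.0 if target >= 0 else -1.0)

def traditional_game(target, n, trials, rng):
    # A new dataset is sampled on every trial, so risk is averaged over datasets.
    results = []
    for t in range(trials):
        dataset = [rng.gauss(0.0, 1.0) for _ in range(n)]
        member = (t % 2 == 0)  # alternate membership to keep the classes balanced
        data = dataset + [target] if member else dataset
        results.append((member, attack_score(train(data, rng), target)))
    return results

def fixed_dataset_game(target, dataset, trials, rng):
    # The rest of the dataset is fixed; only membership and training noise vary.
    results = []
    for t in range(trials):
        member = (t % 2 == 0)
        data = dataset + [target] if member else dataset
        results.append((member, attack_score(train(data, rng), target)))
    return results

def auc(results):
    # Probability that a member's score exceeds a non-member's score (ties count half).
    pos = [s for m, s in results if m]
    neg = [s for m, s in results if not m]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

rng = random.Random(0)
target = 3.0
averaged_auc = auc(traditional_game(target, 20, 400, rng))
one_dataset = [rng.gauss(0.0, 1.0) for _ in range(20)]
specific_auc = auc(fixed_dataset_game(target, one_dataset, 400, rng))
print(f"averaged over datasets: {averaged_auc:.3f}")
print(f"for one fixed dataset:  {specific_auc:.3f}")
```

In this toy setting the two numbers generally differ, because the dataset-to-dataset variation that blurs the attacker's signal in the first game is absent in the second. Which game the paper's attackers and generators correspond to in detail should be taken from the paper itself, not from this sketch.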

Taxonomy

Topics: Adversarial Robustness in Machine Learning

Methods: Sparse Evolutionary Training