Curation Leaks: Membership Inference Attacks against Data Curation for Machine Learning

Dariush Wahdany (1); Matthew Jagielski (2); Adam Dziedzic (1); Franziska Boenisch (1) ((1) CISPA Helmholtz Center for Information Security; (2) Anthropic)

arXiv:2603.00811·cs.LG·March 3, 2026

Open Access · 3 Reviews

TL;DR

This paper reveals that data curation processes in machine learning can leak private information through various stages, even when models are trained only on curated public data, and proposes privacy-preserving adaptations.

Contribution

It introduces novel membership inference attacks targeting curation steps and demonstrates that privacy risks exist beyond model training, proposing differentially private curation methods as mitigation.

Findings

01. Curation stages leak information about the private data.

02. Models trained on curated data can still reveal membership in the private data.

03. Differential privacy effectively reduces leakage.

Abstract

In machine learning, curation is used to select the most valuable data for improving both model accuracy and computational efficiency. Recently, curation has also been explored as a solution for private machine learning: rather than training directly on sensitive data, which is known to leak information through model predictions, the private data is used only to guide the selection of useful public data. The resulting model is then trained solely on curated public data. It is tempting to assume that such a model is privacy-preserving because it has never seen the private data. Yet, we show that without further protection, curation pipelines can still leak private information. Specifically, we introduce novel attacks against popular curation methods, targeting every major step: the computation of curation scores, the selection of the curated subset, and the final trained model. We…
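As a toy illustration of the selection-stage leakage described in the abstract, the sketch below scores public points by their distance to the nearest private point and keeps the top k. The nearest-neighbor scoring rule, the function names, and the hand-picked 2D data are all assumptions made for illustration; they are not the curation methods or attacks evaluated in the paper.

```python
import numpy as np

def curation_scores(public, private):
    """Hypothetical score: negative distance to the nearest private point."""
    # Pairwise Euclidean distances, shape (n_public, n_private).
    dists = np.linalg.norm(public[:, None, :] - private[None, :, :], axis=-1)
    return -dists.min(axis=1)

def select_top_k(scores, k):
    """Return the sorted indices of the k highest-scoring public points."""
    return sorted(np.argsort(-scores, kind="stable")[:k].tolist())

# Made-up data: two public points near the origin, two near (5, 5).
public = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0], [5.1, 5.0]])
private_without = np.array([[0.1, 0.0], [0.9, 0.1]])
target = np.array([[5.0, 5.05]])  # the record whose membership is in question
private_with = np.vstack([private_without, target])

selected_without = select_top_k(curation_scores(public, private_without), k=2)
selected_with = select_top_k(curation_scores(public, private_with), k=2)

# An observer of the selected subset can guess membership of the target
# from whether its public neighbor (index 2) was selected.
guess_without = 2 in selected_without
guess_with = 2 in selected_with
print(selected_without, selected_with, guess_without, guess_with)
```

In this contrived setup the selected subset changes when the target record joins the private set, even though no model is ever trained on the private data. Real curation pipelines are noisier, so an actual attack would likely need a statistical test rather than a single comparison.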

Peer Reviews

Decision · ICLR 2026 Poster

Reviewer 01 · Rating 2 · Confidence 3

Strengths

- This is the first systematic analysis of privacy leakage in data curation pipelines, addressing a genuine blind spot in the community. It systematically evaluates privacy leakage across three critical stages of the curation pipeline (curation scores, selected data subsets, and the final trained models) using diverse datasets and curation methods.
- The analysis of influence sparsity in image-based curation provides valuable insights into why certain methods are more vulnerable.
- The paper d

Weaknesses

- The threat model is not clearly defined. The adversary's assumed capabilities and knowledge are not clearly stated upfront and appear to change depending on the attack. The adversary's knowledge can escalate to extreme, white-box levels. For the end-to-end TRAK attack, the adversary is assumed to have profound knowledge of the curation mechanism, including the model architecture, the ability to compute gradients, and the need to calculate the Gram matrix $G$. This level of white-box access

Reviewer 02 · Rating 6 · Confidence 3

Strengths

1. The paper reveals new privacy risks of data curation via membership inference attacks.
2. It systematically reviews MIAs at every step of the data curation pipeline, and attacks are proposed and validated for all of them.
3. Existing MIAs work in a curation-method-agnostic way with modifications, but the paper also proposes custom attacks that target concrete data curation methods, including TRAK and image-based data curation.

Weaknesses

1. The choice of parameters is not clear or discussed in the main text.
2. The computational complexity of the attacks is not discussed.
3. Experiments for end-to-end model MIAs are limited.

Please see the questions listed in the section below.

Reviewer 03 · Rating 2 · Confidence 3

Strengths

The problem itself seems interesting.

Weaknesses

1. I’m familiar with LiRA and recognize you use an online variant, but the three attack surfaces are hard to follow because the threat model isn’t explicitly stated up front. Please spell out—on one page—(i) the adversary’s goal (membership in the private target set used for curation vs. classic training-set membership), (ii) what the adversary can observe at each stage (scores, selection mask, final model), and (iii) what the adversary can do (e.g., can they inject public items?). A single “who

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Adversarial Robustness in Machine Learning · Ethics and Social Impacts of AI