Dataset distillation for memorized data: Soft labels can leak held-out teacher knowledge

Freya Behrens; Lenka Zdeborová

arXiv:2506.14457·cs.LG·February 23, 2026

TL;DR

This paper demonstrates that neural network students trained with soft labels from teachers can memorize and transfer specific facts, including held-out data, even without generalization, highlighting risks of information leakage in dataset distillation.

Contribution

It reveals that soft labels can leak memorized teacher knowledge to students, enabling perfect recall of held-out data, and analyzes how this depends on label smoothing temperature.

Findings

01

Students can memorize held-out data with soft labels.

02

Memorization persists across architectures and datasets.

03

Temperature controls the extent of information leakage.
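
To make the role of temperature concrete, here is a minimal sketch of how soft labels are commonly produced from teacher logits with a temperature-scaled softmax. It assumes the standard knowledge-distillation parameterization (logits divided by a temperature before the softmax); whether the paper uses exactly this form is an assumption, and the logit values and the function names `soft_labels` and `entropy` are hypothetical, chosen only for illustration.

```python
import math


def soft_labels(logits, temperature):
    """Temperature-scaled softmax over one example's teacher logits.

    Assumed form: softmax(logits / temperature). This is the usual
    distillation convention, not necessarily the paper's exact setup.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; higher means a flatter label vector."""
    return -sum(p * math.log(p) for p in probs if p > 0)


# Hypothetical teacher logits for a single 4-class example.
teacher_logits = [4.0, 1.0, 0.5, -2.0]

for t in (0.1, 1.0, 10.0):
    probs = soft_labels(teacher_logits, t)
    print(f"T={t}: labels={[round(p, 3) for p in probs]} "
          f"entropy={entropy(probs):.3f}")
```

At low temperature the label vector is close to one-hot, so a student sees little beyond the teacher's top class. At higher temperature the non-argmax entries become larger and carry more of the teacher's relative preferences among the other classes, which is the kind of side information the findings above are concerned with.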

Abstract

Dataset distillation aims to compress training data into fewer examples via a teacher, from which a student can learn effectively. While its success is often attributed to structure in the data, modern neural networks also memorize specific facts, but if and how such memorized information can be transferred in distillation settings remains less understood. In this work, we show that students trained on soft labels from teachers can achieve non-trivial accuracy on held-out memorized data they never directly observed. This effect persists on structured data when the teacher has not generalized. To analyze it in isolation, we consider finite random i.i.d. datasets where generalization is a priori impossible and a successful teacher fit implies pure memorization. Still, students can learn non-trivial information about the held-out data, in some cases up to perfect accuracy. In those…

Peer Reviews

Decision·ICLR 2026 Poster

Reviewer 01 · Rating 6 · Confidence 4

Strengths

This paper tackles a very cool and relevant topic: whether dataset distillation, specifically using soft labels, can transfer memorized, held-out information from a teacher to a student. The work makes several very interesting observations on this front, and its primary strength lies in the careful experimental design used to isolate this phenomenon. The modular addition experiments, for instance, provide a compelling and clean demonstration of the difference in transfer outcomes between a gener

Weaknesses

The paper's primary weaknesses lie in the clarity of its presentation and the limited scope of the experiments.

* **Clarity and Focus:** The paper presents a wide array of experimental setups (modular addition, logistic regression, MLPs, varying temperature, classes, $\rho$, etc.) but struggles to synthesize them into a cohesive narrative. Many interesting results are condensed into dense paragraphs, making it difficult for the reader to dissect the core takeaways from each experiment.
* **Sugg

Reviewer 02 · Rating 6 · Confidence 3

Strengths

I enjoyed reading the submission. The experiments seem well-executed and are designed well: they do a good job probing the phenomenon. I will likely study this paper further, and I expect many other people will find it interesting.

Weaknesses

My main criticism of the paper is that the main takeaway, "soft labels can leak held-out teacher knowledge," appears to be known. Theorem 1 of Phuong and Lampert (2019), who the authors do cite as related work, shows that soft labels in a linear classification setting lead to the student recovering the teacher's weights exactly. Now, this submission has a lot of results beyond the headline, so the work is still valuable. But the takeaways are less clear-cut. With so many experiments, the main

Reviewer 03 · Rating 4 · Confidence 2

Strengths

- The authors have done a meticulous and rigorous job of clearly defining their hypothesis and constructing a specific set of experiments to study its validity.
- Although the experimental framework is constrained (the tasks studied and the networks used), the results presented seem convincing in terms of proving the authors' hypothesis.

Weaknesses

- Motivation: It is not clear what the motivation of the presented experiments is. Even if the paper's hypothesis is true, what does that mean in practice and what are the applications? Why should the community pay attention to the results of this paper? While this is somewhat described in the privacy paragraph of Section 2, more clarity is needed to understand the paper's motivation.
- Impact of presented results: Can the paper's results be applied to real-world tasks, datasets and networks? Unfortu

Code & Models

Repositories

spoc-group/dataset-distillation-memorization
PyTorch · Official

Taxonomy

Topics: Machine Learning and Data Classification