Dark Distillation: Backdooring Distilled Datasets without Accessing Raw Data
Ziyuan Yang, Ming Yan, Yi Zhang, Joey Tianyi Zhou

TL;DR
This paper shows that distilled datasets, which are widely used for efficient data sharing, are vulnerable to backdoor attacks: an attacker with no access to the raw data can reconstruct class archetypes from the distilled set and inject malicious triggers.
Contribution
It introduces a backdoor attack on distilled datasets that requires no access to the raw data, and shows that the attack is both effective and cheap to run.
Findings
Distilled datasets are highly susceptible to backdoor attacks.
The attack can be performed without raw data access.
The method is efficient, taking less than one minute in some cases.
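To make the threat concrete, here is a minimal sketch of tampering with an already-distilled dataset using a generic patch trigger and label flipping. It is an illustration only and is not the paper's method, which reconstructs class archetypes first (details the summary here does not give). The names `stamp_trigger` and `poison_distilled_set`, the 4x4 toy images, the patch size, and the poisoning rate are all assumptions made for this example.

```python
import copy
import random


def stamp_trigger(image, patch_size=2, value=1.0):
    """Return a copy of a 2-D image (list of rows) with a square patch
    written into its bottom-right corner. Hypothetical helper."""
    stamped = copy.deepcopy(image)
    height, width = len(stamped), len(stamped[0])
    for r in range(height - patch_size, height):
        for c in range(width - patch_size, width):
            stamped[r][c] = value
    return stamped


def poison_distilled_set(samples, target_label, rate=0.5, seed=0):
    """Stamp the trigger on a fraction of (image, label) pairs and relabel
    them to target_label. The input list is left unmodified."""
    rng = random.Random(seed)
    n_poison = int(len(samples) * rate)
    chosen = set(rng.sample(range(len(samples)), n_poison))
    poisoned = []
    for i, (image, label) in enumerate(samples):
        if i in chosen:
            poisoned.append((stamp_trigger(image), target_label))
        else:
            poisoned.append((copy.deepcopy(image), label))
    return poisoned


# Toy stand-in for a distilled dataset: one blank 4x4 image per class.
distilled = [([[0.0] * 4 for _ in range(4)], label) for label in range(4)]
tampered = poison_distilled_set(distilled, target_label=0, rate=0.5)
```

The sketch only touches the small synthetic set, never the raw data, which is the access assumption the findings above describe. A model trained on `tampered` would be expected to associate the corner patch with the target label; how strongly depends on factors this toy example does not model.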
Abstract
Dataset distillation (DD) enhances training efficiency and reduces bandwidth by condensing large datasets into smaller synthetic ones. It enables models to achieve performance comparable to those trained on the raw full dataset and has become a widely adopted method for data sharing. However, security concerns in DD remain underexplored. Existing studies typically assume that malicious behavior originates from dataset owners during the initial distillation process, where backdoors are injected into raw datasets. In contrast, this work is the first to address a more realistic and concerning threat: attackers may intercept the dataset distribution process, inject backdoors into the distilled datasets, and redistribute them to users. While distilled datasets were previously considered resistant to backdoor attacks, we demonstrate that they remain vulnerable to such attacks. Furthermore, we…
Taxonomy
Topics: Machine Learning and Data Classification
