SelMatch: Effectively Scaling Up Dataset Distillation via Selection-Based Initialization and Partial Updates by Trajectory Matching
Yongmin Lee, Hye Won Chung

TL;DR
SelMatch is a dataset distillation method that combines selection-based initialization with partial updates via trajectory matching. It scales to larger images-per-class (IPC) settings, where many distillation methods degrade, and outperforms existing methods.
Contribution
Introduces SelMatch, a novel trajectory-matching based dataset distillation technique that scales effectively with IPC by combining selection-based initialization and partial updates.
Findings
SelMatch outperforms existing methods across various IPC scales.
Effective in synthesizing small datasets on CIFAR-10/100 and TinyImageNet.
Maintains high performance as the synthetic dataset size (IPC) increases.
Abstract
Dataset distillation aims to synthesize a small number of images per class (IPC) from a large dataset to approximate full dataset training with minimal performance loss. While effective in very small IPC ranges, many distillation methods become less effective, even underperforming random sample selection, as IPC increases. Our examination of state-of-the-art trajectory-matching based distillation methods across various IPC scales reveals that these methods struggle to incorporate the complex, rare features of harder samples into the synthetic dataset even with the increased IPC, resulting in a persistent coverage gap between easy and hard test samples. Motivated by such observations, we introduce SelMatch, a novel distillation method that effectively scales with IPC. SelMatch uses selection-based initialization and partial updates through trajectory matching to manage the synthetic…
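The two ideas named in the abstract, selection-based initialization and partial updates, can be illustrated with a toy NumPy sketch. This is not the authors' implementation. The abstract above is truncated and does not state the selection criterion or how the updated portion is chosen, so the rules used here are assumptions: `select_init` takes the lowest-scoring samples per class, `split_trainable` picks the updated subset at random, and the gradient is random noise standing in for a trajectory-matching gradient. All function names and parameters (`select_init`, `split_trainable`, `partial_update`, `update_fraction`) are hypothetical.

```python
import numpy as np

def select_init(scores, labels, ipc, num_classes):
    """Pick `ipc` real-sample indices per class, lowest score first (assumed rule)."""
    chosen = []
    for c in range(num_classes):
        idx = np.flatnonzero(labels == c)
        order = idx[np.argsort(scores[idx], kind="stable")]
        chosen.append(order[:ipc])
    return np.concatenate(chosen)

def split_trainable(n, update_fraction, rng):
    """Boolean mask: True marks synthetic samples that will be updated (random split, assumed)."""
    mask = np.zeros(n, dtype=bool)
    k = int(round(update_fraction * n))
    mask[rng.choice(n, size=k, replace=False)] = True
    return mask

def partial_update(synthetic, grad, mask, lr):
    """Apply one gradient step to the masked samples only; the rest stay fixed."""
    out = synthetic.copy()
    out[mask] -= lr * grad[mask]
    return out

rng = np.random.default_rng(0)
images = rng.normal(size=(40, 8))      # toy "real" data: 40 samples, 8 features
labels = np.repeat(np.arange(4), 10)   # 4 classes, 10 samples each
scores = rng.random(40)                # placeholder difficulty scores

# Initialize the synthetic set from selected real samples.
init_idx = select_init(scores, labels, ipc=3, num_classes=4)
synthetic = images[init_idx].copy()

# Update only part of the synthetic set.
mask = split_trainable(len(synthetic), update_fraction=0.5, rng=rng)
grad = rng.normal(size=synthetic.shape)  # stand-in for a trajectory-matching gradient
updated = partial_update(synthetic, grad, mask, lr=0.1)
```

In this sketch the samples outside `mask` remain exact copies of real data after the update, while the masked samples move away from their initial values.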
Taxonomy
Topics: Data Management and Algorithms · Advanced Neural Network Applications · Graph Theory and Algorithms
