TGDD: Trajectory Guided Dataset Distillation with Balanced Distribution
Fengli Ran, Xiao Pu, Bo Liu, Xiuli Bi, Bin Xiao

TL;DR
TGDD is a dataset distillation method that aligns synthetic data with feature representations as they evolve during training, improving downstream task performance while remaining efficient.
Contribution
It reformulates distribution matching as a trajectory-guided process, capturing evolving semantics and reducing class overlap for improved dataset distillation.
Findings
Achieves state-of-the-art results on ten datasets.
Gains 5.0% accuracy on high-resolution benchmarks.
Balances performance and efficiency without extra optimization.
Abstract
Dataset distillation compresses large datasets into compact synthetic ones to reduce storage and computational costs. Among various approaches, distribution matching (DM)-based methods have attracted attention for their high efficiency. However, they often overlook the evolution of feature representations during training, which limits the expressiveness of synthetic data and weakens downstream performance. To address this issue, we propose Trajectory Guided Dataset Distillation (TGDD), which reformulates distribution matching as a dynamic alignment process along the model's training trajectory. At each training stage, TGDD captures evolving semantics by aligning the feature distributions of the synthetic and original datasets. Meanwhile, it introduces a distribution constraint regularization to reduce class overlap. This design helps synthetic data preserve both semantic diversity…
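The idea in the abstract, matching feature distributions at several points along a training trajectory while discouraging class overlap, can be illustrated with a small sketch. This is not the authors' implementation. It assumes that each training stage is represented by a hypothetical linear-plus-ReLU feature extractor, that distributions are compared only through class-wise feature means (the simplest form of distribution matching), and that overlap is penalized with a hinge on the distance between synthetic class means. The names `trajectory_dm_loss`, `margin`, and `lam` are illustrative and do not come from the paper.

```python
import math


def embed(weights, x):
    """Hypothetical feature extractor: one linear layer followed by ReLU."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in weights]


def class_means(weights, samples, labels):
    """Mean feature vector of each class under one checkpoint."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        f = embed(weights, x)
        if y not in sums:
            sums[y] = [0.0] * len(f)
            counts[y] = 0
        sums[y] = [a + b for a, b in zip(sums[y], f)]
        counts[y] += 1
    return {y: [a / counts[y] for a in s] for y, s in sums.items()}


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def trajectory_dm_loss(checkpoints, real, real_y, syn, syn_y,
                       margin=1.0, lam=0.1):
    """Average, over checkpoints, of a mean-matching term plus an
    assumed hinge penalty on overlapping synthetic class means."""
    total = 0.0
    for weights in checkpoints:
        mu_real = class_means(weights, real, real_y)
        mu_syn = class_means(weights, syn, syn_y)
        # Matching term: pull each synthetic class mean toward the real one.
        match = sum(sq_dist(mu_real[c], mu_syn[c]) for c in mu_syn)
        # Overlap term (assumption): penalize class means closer than margin.
        classes = sorted(mu_syn)
        overlap = 0.0
        for i, a in enumerate(classes):
            for b in classes[i + 1:]:
                d = math.sqrt(sq_dist(mu_syn[a], mu_syn[b]))
                overlap += max(0.0, margin - d) ** 2
        total += match + lam * overlap
    return total / len(checkpoints)


# Two toy "checkpoints" standing in for an early and a late training stage.
checkpoints = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[2.0, 0.0], [0.0, 2.0]],
]
real = [[2.0, 0.0], [2.0, 0.0], [0.0, 2.0], [0.0, 2.0]]
real_y = [0, 0, 1, 1]
syn = [[1.0, 0.0], [0.0, 1.0]]   # one synthetic sample per class
syn_y = [0, 1]

print(trajectory_dm_loss(checkpoints, real, real_y, syn, syn_y))  # 5.0
```

In a real distillation loop the synthetic samples would be updated by gradient descent on such a loss, and the checkpoints would come from a network trained on the original data. Averaging over checkpoints is what separates this from matching against a single fixed or randomly initialized extractor.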
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning
