Teddy: Efficient Large-Scale Dataset Distillation via Taylor-Approximated Matching
Ruonan Yu, Songhua Liu, Jingwen Ye, Xinchao Wang

TL;DR
Teddy introduces a Taylor-approximated dataset distillation method that significantly improves efficiency and performance for large-scale datasets like ImageNet by reducing computational complexity and leveraging pre-cached models.
Contribution
The paper proposes a novel Taylor-based approximation and a pre-cached model pool to enhance large-scale dataset distillation efficiency and effectiveness.
Findings
Achieves state-of-the-art efficiency and performance on ImageNet datasets.
Reduces runtime by 46.6% compared to prior methods.
Surpasses previous methods by up to 12.8% in performance.
Abstract
Dataset distillation, or condensation, refers to compressing a large-scale dataset into a much smaller one, enabling models trained on this synthetic dataset to generalize effectively on real data. Tackling this challenge, as defined, relies on a bi-level optimization algorithm: a new model is trained in each iteration within a nested loop, with gradients propagated through an unrolled computation graph. However, this approach incurs high memory and time complexity, making it difficult to scale to large datasets such as ImageNet. To address these concerns, this paper introduces Teddy, a Taylor-approximated dataset distillation framework designed to handle large-scale datasets and enhance efficiency. On the one hand, backed up by theoretical analysis, we propose a memory-efficient approximation derived from Taylor expansion, which transforms the original form dependent on multi-step…
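The bi-level formulation described in the abstract can be illustrated with a toy sketch. The example below is not the Teddy method; it is a minimal, assumed setup (a 1-D linear model, a single synthetic point, and hypothetical function names) that shows the nested loop: each outer step retrains a fresh model on the synthetic data, and the derivative of the trained weight with respect to the synthetic label is carried through every unrolled inner step. Here the derivative is tracked by hand in forward mode; in a deep network, differentiating through the unrolled steps is what typically drives the memory and time cost.

```python
# Toy bi-level dataset distillation: learn one synthetic label y_s so that a
# 1-D linear model trained on (x_s, y_s) fits the real data.
# All names are hypothetical; this is an illustration, not the Teddy algorithm.

def inner_train(y_s, x_s=1.0, w0=0.0, lr=0.1, steps=20):
    """Train w on the synthetic point and track dw/dy_s through every step."""
    w, dw_dys = w0, 0.0
    for _ in range(steps):
        # Inner loss is 0.5 * (w * x_s - y_s) ** 2.
        grad_w = x_s * (w * x_s - y_s)
        # Differentiate the update w <- w - lr * grad_w with respect to y_s.
        dw_dys = dw_dys - lr * (x_s * x_s * dw_dys - x_s)
        w = w - lr * grad_w
    return w, dw_dys

def outer_loss_and_grad(y_s, real):
    """Loss of the trained model on real data and its gradient w.r.t. y_s."""
    w, dw_dys = inner_train(y_s)
    loss = sum(0.5 * (w * x - y) ** 2 for x, y in real) / len(real)
    dloss_dw = sum((w * x - y) * x for x, y in real) / len(real)
    # Chain rule through the whole unrolled inner loop.
    return loss, dloss_dw * dw_dys

def distill(real, y_s=0.0, outer_lr=0.2, outer_steps=200):
    """Outer loop: every iteration retrains a fresh model starting from w0."""
    for _ in range(outer_steps):
        _, grad = outer_loss_and_grad(y_s, real)
        y_s -= outer_lr * grad
    return y_s

real_data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # real data follow y = 2x
y_syn = distill(real_data)
w_final, _ = inner_train(y_syn)
print(f"synthetic label: {y_syn:.4f}, weight after training on it: {w_final:.4f}")
```

Even in this toy, the cost of one outer step grows with the number of inner steps, since the derivative has to pass through each of them. That dependence on multi-step inner training is the bottleneck the abstract says the Taylor-based approximation is meant to remove.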
Taxonomy
Topics: Neural Networks and Applications · Data Stream Mining Techniques
Methods: Balanced Selection
