Towards Stable and Storage-efficient Dataset Distillation: Matching Convexified Trajectory
Wenliang Zhong, Haoyu Tang, Qinghai Zheng, Mingzhu Xu, Yupeng Hu, and Liqiang Nie

TL;DR
This paper introduces Matching Convexified Trajectory (MCT), a novel dataset distillation method that improves stability, convergence speed, and storage efficiency by leveraging neural tangent kernel insights and convex trajectory guidance.
Contribution
The paper proposes MCT, a new approach that transforms the objective function to enhance dataset distillation by stabilizing and accelerating the training trajectory guidance.
Findings
MCT outperforms traditional MTT in stability and convergence speed.
MCT reduces storage requirements for expert trajectories.
Experimental results validate the effectiveness of MCT across multiple datasets.
Abstract
The rapid evolution of deep learning and large language models has led to an exponential growth in the demand for training data, prompting the development of Dataset Distillation methods to address the challenges of managing large datasets. Among these, Matching Training Trajectories (MTT) has been a prominent approach, which replicates the training trajectory of an expert network on real data with a synthetic dataset. However, our investigation found that this method suffers from three significant limitations: 1. Instability of expert trajectory generated by Stochastic Gradient Descent (SGD); 2. Low convergence speed of the distillation process; 3. High storage consumption of the expert trajectory. To address these issues, we offer a new perspective on understanding the essence of Dataset Distillation and MTT through a simple transformation of the objective function, and introduce a…
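As a rough illustration of the trajectory-matching idea that MTT is built on, the sketch below computes a normalized parameter-distance loss between a student network trained on synthetic data and a later checkpoint on the expert trajectory. The function name `trajectory_matching_loss`, the flat NumPy parameter vectors, and the toy values are assumptions made for illustration; this is not the authors' code, and it does not attempt to reproduce the MCT objective, which the truncated abstract does not spell out.

```python
import numpy as np

def trajectory_matching_loss(student_end, expert_start, expert_target):
    """Normalized squared distance in the style of MTT trajectory matching.

    student_end   : parameters reached by training on synthetic data,
                    starting from expert_start (hypothetical flat vector)
    expert_start  : expert checkpoint where the student was initialized
    expert_target : later expert checkpoint the student should reach
    """
    student_end = np.asarray(student_end, dtype=float)
    expert_start = np.asarray(expert_start, dtype=float)
    expert_target = np.asarray(expert_target, dtype=float)

    # Squared distance between where the student ended and where the expert went.
    numerator = np.sum((student_end - expert_target) ** 2)
    # Squared distance the expert itself travelled, used for normalization.
    denominator = np.sum((expert_start - expert_target) ** 2)
    return numerator / denominator

# Toy example with two-dimensional "parameters" (illustrative values only).
start = np.array([0.0, 0.0])
target = np.array([2.0, 0.0])
halfway = np.array([1.0, 0.0])
loss_halfway = trajectory_matching_loss(halfway, start, target)
```

Under this normalization, a student that does not move from the starting checkpoint scores 1 and a student that lands exactly on the target checkpoint scores 0, so the value reads as the fraction of the expert's squared displacement still unmatched.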
Taxonomy
Topics: Data Stream Mining Techniques
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
