Generative Dataset Distillation Based on Self-knowledge Distillation
Longzhen Li, Guang Li, Ren Togo, Keisuke Maeda, Takahiro Ogawa, Miki Haseyama

TL;DR
This paper introduces a generative dataset distillation method that leverages self-knowledge distillation and logit standardization to improve the accuracy of synthetic data in representing original datasets, outperforming existing methods.
Contribution
It proposes a novel generative dataset distillation approach that enhances distribution matching accuracy using self-knowledge distillation and logit standardization.
Findings
Outperforms state-of-the-art distillation methods
Achieves higher accuracy in synthetic data representation
Demonstrates superior distillation performance through extensive experiments
Abstract
Dataset distillation is an effective technique for reducing the cost and complexity of model training while maintaining performance by compressing large datasets into smaller, more efficient versions. In this paper, we present a novel generative dataset distillation method that can improve the accuracy of aligning prediction logits. Our approach integrates self-knowledge distillation to achieve more precise distribution matching between the synthetic and original data, thereby capturing the overall structure and relationships within the data. To further improve the accuracy of alignment, we introduce a standardization step on the logits before performing distribution matching, ensuring consistency in the range of logits. Through extensive experiments, we demonstrate that our method outperforms existing state-of-the-art methods, resulting in superior distillation performance.
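The standardization step described in the abstract can be illustrated with a minimal sketch. It assumes that "standardization" means a per-sample z-score of the logit vector, and it substitutes a simple batch-mean matching loss for the paper's distribution matching objective, whose exact form the abstract does not give. The function names `standardize_logits` and `mean_matching_loss` are hypothetical, and the self-knowledge distillation component is not modeled here.

```python
import math

def standardize_logits(logits, eps=1e-7):
    """Z-score one sample's logit vector (assumed form of the standardization step)."""
    n = len(logits)
    mean = sum(logits) / n
    std = math.sqrt(sum((z - mean) ** 2 for z in logits) / n)
    # eps guards against division by zero when all logits are equal.
    return [(z - mean) / (std + eps) for z in logits]

def mean_matching_loss(real_batch, synth_batch):
    """Squared distance between batch-mean standardized logits.

    A simple stand-in for distribution matching, not the paper's loss.
    Each batch is a list of per-sample logit vectors of equal length.
    """
    def batch_mean(batch):
        rows = [standardize_logits(row) for row in batch]
        return [sum(col) / len(rows) for col in zip(*rows)]

    real_mean = batch_mean(real_batch)
    synth_mean = batch_mean(synth_batch)
    return sum((r - s) ** 2 for r, s in zip(real_mean, synth_mean))

# Example: logits that differ only by scale and shift match after standardization.
real = [[2.0, 1.0, 0.1], [0.5, 2.5, -1.0]]
rescaled = [[10 * z + 3 for z in row] for row in real]
loss = mean_matching_loss(real, rescaled)  # close to zero
```

Because each logit vector is shifted to zero mean and scaled to unit variance, two sets of logits that differ only in range contribute almost nothing to the loss, which is one reading of what "ensuring consistency in the range of logits" is meant to achieve.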
Taxonomy
Topics: Neural Networks and Applications
