UDD: Dataset Distillation via Mining Underutilized Regions

Shiguang Wang; Zhongyu Zhang; Jian Cheng

arXiv:2408.16268·cs.CV·August 30, 2024

Open Access

TL;DR

This paper introduces UDD, a novel dataset distillation method that identifies and exploits underutilized regions in synthetic images to improve dataset efficiency and model performance across multiple datasets.

Contribution

UDD proposes dynamic policies that search for and exploit underutilized regions in synthetic data, improving both the utilization of the synthetic dataset and the accuracy of models trained on it.

Findings

01 Outperforms state-of-the-art methods on MNIST, FashionMNIST, SVHN, CIFAR-10, and CIFAR-100

02 Improves accuracy on CIFAR-10 and CIFAR-100 by 4.0% and 3.7%, respectively

03 Uses a category-wise feature contrastive loss to enhance class distinguishability

Abstract

Dataset distillation synthesizes a small dataset such that a model trained on it approximates the performance of a model trained on the original dataset. Recent studies on dataset distillation have focused primarily on the design of the optimization process, with methods such as gradient matching, feature alignment, and training trajectory matching. However, little attention has been given to the issue of underutilized regions in synthetic images. In this paper, we propose UDD, a novel approach that identifies and exploits the underutilized regions to make them informative and discriminative, and thus improves the utilization of the synthetic dataset. Technically, UDD involves two policies for searching underutilized regions under different conditions, i.e., a response-based policy and a data-jittering-based policy. Compared with previous works, these two policies are utilization-sensitive, equipped with the ability to…
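To make the "gradient matching" family mentioned in the abstract concrete, here is a minimal sketch of a generic gradient-matching distance. It is not UDD's method and does not implement the underutilized-region policies. It assumes a linear softmax classifier, a layer-wise cosine distance, and a tiny hand-made dataset; all function and variable names are hypothetical and chosen only for illustration.

```python
import math

def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def ce_grad(W, X, y):
    """Mean gradient of softmax cross-entropy w.r.t. W for a linear model.

    W is a (classes x features) list of lists, X a list of feature vectors,
    y a list of integer labels.
    """
    n_classes, n_feat = len(W), len(W[0])
    G = [[0.0] * n_feat for _ in range(n_classes)]
    for x, label in zip(X, y):
        logits = [sum(w * v for w, v in zip(row, x)) for row in W]
        p = softmax(logits)
        for c in range(n_classes):
            err = p[c] - (1.0 if c == label else 0.0)
            for j in range(n_feat):
                G[c][j] += err * x[j] / len(X)
    return G

def matching_distance(G_real, G_syn):
    """Sum over output rows of (1 - cosine similarity) between two gradients."""
    total = 0.0
    for a, b in zip(G_real, G_syn):
        dot = sum(u * v for u, v in zip(a, b))
        na = math.sqrt(sum(u * u for u in a))
        nb = math.sqrt(sum(v * v for v in b))
        total += 1.0 - dot / (na * nb + 1e-12)
    return total

# Toy "real" dataset: two classes in a 2-D feature space (assumed data).
X_real = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
y_real = [0, 0, 1, 1]

# Two candidate synthetic sets with one sample per class.
X_syn = [[1.0, 0.0], [0.0, 1.0]]
y_good = [0, 1]   # labels agree with the real data
y_bad = [1, 0]    # labels swapped, so gradients point the opposite way

W = [[0.0, 0.0], [0.0, 0.0]]  # model weights at initialization

G_real = ce_grad(W, X_real, y_real)
d_good = matching_distance(G_real, ce_grad(W, X_syn, y_good))
d_bad = matching_distance(G_real, ce_grad(W, X_syn, y_bad))

print("distance, well-matched synthetic set:", d_good)
print("distance, mislabeled synthetic set:  ", d_bad)
```

In a full distillation loop this distance would be minimized with respect to the synthetic images themselves, typically across many model initializations and training steps; the sketch only evaluates the objective for two fixed candidates.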

Taxonomy

Topics: Data Mining Algorithms and Applications

Methods: Softmax · Attention Is All You Need · Sparse Evolutionary Training