When Dynamic Data Selection Meets Data Augmentation

Suorong Yang; Peng Ye; Furao Shen; Dongzhan Zhou

arXiv:2505.03809 · cs.LG · May 13, 2025

TL;DR

This paper introduces a novel online framework that unifies dynamic data selection and augmentation, significantly reducing training costs while maintaining or improving model performance and robustness.

Contribution

It proposes a joint estimation method for data selection and augmentation, optimizing their synergy for efficient training and enhanced generalization.

Findings

01 Reduces training costs by 50% on ImageNet-1k without performance loss.

02 Improves model robustness and noise resistance.

03 Outperforms existing methods on various benchmarks.

Abstract

Dynamic data selection aims to accelerate training with lossless performance. However, reducing training data inherently limits data diversity, potentially hindering generalization. While data augmentation is widely used to enhance diversity, it is typically not optimized in conjunction with selection. As a result, directly combining these techniques fails to fully exploit their synergies. To tackle the challenge, we propose a novel online data training framework that, for the first time, unifies dynamic data selection and augmentation, achieving both training efficiency and enhanced performance. Our method estimates each sample's joint distribution of local density and multimodal semantic consistency, allowing for the targeted selection of augmentation-suitable samples while suppressing the inclusion of noisy or ambiguous data. This enables a more significant reduction in dataset size…
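The selection idea in the abstract can be illustrated with a rough sketch. Everything below is an assumption made for illustration, not the authors' implementation: the abstract does not define how local density or multimodal semantic consistency is computed, nor how the two are combined. Here, density is approximated by the mean cosine similarity to the k nearest neighbours, consistency by the cosine similarity between an image embedding and an embedding of its label text, and the combined score (which favours sparse-region samples that agree with their label) is a hypothetical choice. The function names `local_density`, `semantic_consistency`, and `select_samples` are likewise made up.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def local_density(embeddings, k=2):
    """Assumed density proxy: mean similarity to the k most similar other samples."""
    densities = []
    for i, e in enumerate(embeddings):
        sims = sorted(
            (cosine(e, o) for j, o in enumerate(embeddings) if j != i),
            reverse=True,
        )
        top = sims[:k]
        densities.append(sum(top) / len(top))
    return densities

def semantic_consistency(image_embs, label_text_embs):
    """Assumed consistency proxy: image embedding vs. its label-text embedding."""
    return [cosine(i, t) for i, t in zip(image_embs, label_text_embs)]

def select_samples(image_embs, label_text_embs, keep_ratio=0.5, k=2):
    """Rank samples by a hypothetical joint score and keep the top fraction."""
    density = local_density(image_embs, k)
    consistency = semantic_consistency(image_embs, label_text_embs)
    # Hypothetical scoring rule: prefer low-density samples whose image and
    # label text agree; samples with negative consistency get a score of 0.
    scores = [(1 - d) * max(c, 0) for d, c in zip(density, consistency)]
    n_keep = max(1, round(keep_ratio * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:n_keep]), scores

# Toy 2-D embeddings: sample 3 points away from its label text (a "noisy" label).
image_embs = [[1, 0], [0.9, 0.1], [0, 1], [-1, 0]]
label_text_embs = [[1, 0], [1, 0], [0, 1], [1, 0]]
selected, scores = select_samples(image_embs, label_text_embs, keep_ratio=0.5)
print(selected, scores)
```

In this toy run the mislabeled sample receives a zero score and is never selected, which mirrors the abstract's stated goal of suppressing noisy or ambiguous data. In an online setting the selected indices would be recomputed during training and only those samples passed to augmentation; that loop is omitted here.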

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification · Advanced Neural Network Applications