Multimodal-Guided Dynamic Dataset Pruning for Robust and Efficient Data-Centric Learning

Suorong Yang; Peijia Li; Yujie Liu; Zhiming Xu; Peng Ye; Wanli Ouyang; Furao Shen; Dongzhan Zhou

arXiv:2507.12750·cs.LG·July 18, 2025

TL;DR

This paper proposes a dynamic dataset pruning method that adaptively selects training samples using cross-modality semantic consistency and pretrained models, improving efficiency and robustness in data-centric deep learning.

Contribution

It introduces a novel dynamic pruning framework leveraging cross-modality alignment and pretrained models for more robust and efficient data selection.

Findings

1. Enhanced training efficiency through adaptive sample selection.

2. Improved model robustness across different domains.

3. Effective filtering of uninformative samples.

Abstract

Modern deep models are trained on large real-world datasets, where data quality varies and redundancy is common. Data-centric approaches such as dataset pruning have shown promise in improving training efficiency and model performance. However, most existing methods rely on static heuristics or task-specific metrics, limiting their robustness and generalizability across domains. In this work, we introduce a dynamic dataset pruning framework that adaptively selects training samples based on both task-driven difficulty and cross-modality semantic consistency. By incorporating supervision from pretrained multimodal foundation models, our approach captures training dynamics while effectively filtering out uninformative samples. Our work highlights the potential of integrating cross-modality alignment for robust sample selection, advancing data-centric learning toward more efficient and…
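
As a rough illustration of the selection idea described in the abstract, the sketch below scores each sample by combining a task-driven difficulty signal (the current per-sample loss) with a cross-modality consistency signal (cosine similarity between an image embedding and a text embedding of its label), then keeps the top-scoring fraction for the current epoch. This is not the authors' algorithm: the function names, the min-max normalization, the linear weighting `alpha`, and the `keep_ratio` parameter are all assumptions made for illustration, and the embeddings are assumed to come from some pretrained multimodal model that is not shown here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def min_max(values):
    """Rescale values to [0, 1]; constant inputs map to 0.5."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5] * len(values)
    return [(x - lo) / (hi - lo) for x in values]


def select_samples(losses, image_embs, text_embs, keep_ratio=0.5, alpha=0.5):
    """Return sorted indices of the samples to keep for the current epoch.

    Hypothetical scoring rule (an assumption, not taken from the paper):
    alpha * normalized loss + (1 - alpha) * normalized image-text similarity.
    """
    difficulty = min_max(losses)
    consistency = min_max([cosine(i, t) for i, t in zip(image_embs, text_embs)])
    scores = [alpha * d + (1 - alpha) * c for d, c in zip(difficulty, consistency)]
    n_keep = max(1, round(keep_ratio * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda k: scores[k], reverse=True)
    return sorted(ranked[:n_keep])


# Toy data: sample 1 has a high loss but its image and text embeddings disagree,
# which is how a mislabeled or uninformative sample might look.
losses = [0.1, 2.0, 1.5, 0.2]
image_embs = [[1, 0], [1, 0], [0, 1], [1, 1]]
text_embs = [[1, 0], [0, 1], [0, 1], [1, 1]]
kept = select_samples(losses, image_embs, text_embs, keep_ratio=0.5)
print(kept)  # [2, 3]
```

Calling `select_samples` again each epoch with refreshed losses is what would make such a scheme dynamic rather than static: the kept subset changes as the model learns. Under this toy rule, a sample that is hard but semantically inconsistent is ranked below one that is hard and consistent.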

Taxonomy

Topics: Face and Expression Recognition