Differential-informed Sample Selection Accelerates Multimodal Contrastive Learning

Zihua Zhao; Feng Hong; Mengxi Chen; Pengyi Chen; Benyuan Liu; Jiangchao Yao; Ya Zhang; Yanfeng Wang

arXiv:2507.12998·cs.CV·July 18, 2025

TL;DR

This paper introduces DISSect, a differential-informed sample selection method that improves the efficiency of multimodal contrastive learning by effectively identifying and excluding noisy data, leading to faster training and better performance.

Contribution

The paper proposes a novel differential-based sample selection approach that addresses noisy correspondence in contrastive learning, with theoretical analysis and extensive experimental validation.

Findings

01

DISSect outperforms state-of-the-art methods on benchmark datasets.

02

The differential between current and historical model predictions effectively identifies noisy samples.

03

The method accelerates training while maintaining or improving model performance.
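
The second finding can be illustrated with a small sketch. It is not the authors' implementation: the page does not give the scoring rule, so the sketch assumes that each pair is scored by the cosine similarity of its image and text embeddings, that the differential is the current score minus the score from an earlier checkpoint, and that pairs with the largest differential are kept. The function names `pair_similarity` and `select_by_differential` and the `keep_ratio` parameter are hypothetical.

```python
import numpy as np


def pair_similarity(img_emb, txt_emb):
    """Cosine similarity between row i of img_emb and row i of txt_emb."""
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    return np.sum(img * txt, axis=1)


def select_by_differential(current_sim, historical_sim, keep_ratio=0.5):
    """Return sorted indices of the pairs with the largest differential.

    Assumption (not stated on this page): a pair whose similarity has grown
    since the historical checkpoint is more likely to be correctly matched,
    so pairs with a small or negative differential are dropped as
    potentially noisy.
    """
    diff = np.asarray(current_sim) - np.asarray(historical_sim)
    n_keep = max(1, int(round(keep_ratio * len(diff))))
    # Stable sort on the negated differential: largest differential first.
    order = np.argsort(-diff, kind="stable")
    return np.sort(order[:n_keep])


# Toy scores for four image-text pairs at two checkpoints.
historical = np.array([0.5, 0.4, 0.6, 0.3])
current = np.array([0.9, 0.2, 0.8, 0.1])
kept = select_by_differential(current, historical, keep_ratio=0.5)
```

In this toy batch, pairs 0 and 2 gained similarity between checkpoints and are kept, while pairs 1 and 3 lost similarity and are dropped. Only the historical scores need to be stored between epochs, which is one reason a differential-based rule can be cheap to run online.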

Abstract

The remarkable success of contrastive-learning-based multimodal models has been driven largely by training on ever-larger datasets at high computational cost. Sample selection, as an efficient alternative paradigm, is an important direction for accelerating training. However, recent sample selection methods either rely on an oracle model to select a high-quality coreset offline, which is limiting in cold-start scenarios, or perform online selection based on real-time model predictions, which does not sufficiently or efficiently account for noisy correspondence. To address this dilemma, we propose a novel Differential-Informed Sample Selection (DISSect) method, which accurately and efficiently discriminates noisy correspondence to accelerate training. Specifically, we rethink the impact of noisy correspondence on contrastive learning and propose…

Taxonomy

Topics: Neural Networks and Applications · Gaussian Processes and Bayesian Inference · Face and Expression Recognition