Mutual Information guided Visual Contrastive Learning

Hanyang Chen; Yanchao Yang

arXiv:2511.00028 · cs.CV · November 4, 2025

Open Access

TL;DR

This paper introduces a mutual information-based data selection method for contrastive learning, aiming to improve feature generalization by leveraging real-world distribution insights rather than human-designed augmentations.

Contribution

It proposes a novel data augmentation strategy guided by mutual information, enhancing contrastive learning without relying on manual hypotheses or engineering.

Findings

1. Improved generalization of learned features in open environments.
2. Effective across multiple state-of-the-art frameworks.
3. Establishes mutual information as a promising direction for data augmentation.

Abstract

Representation learning methods utilizing the InfoNCE loss have demonstrated considerable capacity in reducing human annotation effort by training invariant neural feature extractors. Although different variants of the training objective adhere to the information maximization principle between the data and learned features, data selection and augmentation still rely on human hypotheses or engineering, which may be suboptimal. For instance, data augmentation in contrastive learning primarily focuses on color jittering, aiming to emulate real-world illumination changes. In this work, we investigate the potential of selecting training data based on their mutual information computed from real-world distributions, which, in principle, should endow the learned features with better generalization when applied in open environments. Specifically, we consider patches attached to scenes that…
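For readers unfamiliar with the InfoNCE loss the abstract refers to, the following is a minimal sketch of the generic objective, not the authors' mutual-information-guided selection method, which the truncated abstract does not fully describe. The function names (`l2_normalize`, `info_nce`), the temperature value, and the toy two-dimensional features are all assumptions chosen for illustration. Each anchor is scored against every candidate by cosine similarity, and the loss is the cross-entropy of picking its own positive.

```python
import math


def l2_normalize(v):
    # Scale a non-zero vector to unit length so dot products are cosine similarities.
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss over a batch.

    positives[i] is the positive view for anchors[i]; every other
    positives[j] in the batch serves as a negative for anchors[i].
    The temperature of 0.1 is an illustrative default, not a value from the paper.
    """
    a = [l2_normalize(v) for v in anchors]
    p = [l2_normalize(v) for v in positives]
    losses = []
    for i, ai in enumerate(a):
        # Temperature-scaled similarity of anchor i to every candidate.
        logits = [sum(x * y for x, y in zip(ai, pj)) / temperature for pj in p]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        # Negative log-probability of selecting the true positive.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Toy features: correctly paired views versus deliberately mismatched views.
aligned = [[1.0, 0.0], [0.0, 1.0]]
shuffled = [[0.0, 1.0], [1.0, 0.0]]
loss_aligned = info_nce(aligned, aligned)
loss_shuffled = info_nce(aligned, shuffled)
```

In this toy setup the correctly paired views give a loss close to zero, while the mismatched pairing gives a much larger one. Which pairs are presented as positives therefore shapes what the encoder learns, and that choice is the part the paper proposes to guide with mutual information instead of hand-designed augmentations.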


Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Human Pose and Action Recognition · Visual Attention and Saliency Detection