Multi-modal contrastive learning adapts to intrinsic dimensions of shared latent variables
Yu Gui, Cong Ma, Zongming Ma

TL;DR
This paper provides a theoretical analysis of multi-modal contrastive learning, showing that it adapts to the intrinsic dimension of the data and learns low-dimensional, informative representations. Experiments on synthetic and real-world datasets support the analysis.
Contribution
It offers a theoretical account of how multi-modal contrastive learning adapts to the intrinsic dimension of the data, going beyond linear representations and specific data distributions.
Findings
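With temperature optimization, multi-modal contrastive learning maximizes mutual information between modalities.
The learned representations adapt to the intrinsic dimension of the data, which can be much lower than the user-specified representation dimension.
Experiments on synthetic and real-world datasets confirm that contrastive learning can learn low-dimensional, informative representations.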
Abstract
Multi-modal contrastive learning, as a self-supervised representation learning technique, has achieved great success in foundation model training, such as CLIP (Radford et al., 2021). In this paper, we study the theoretical properties of the learned representations from multi-modal contrastive learning beyond linear representations and specific data distributions. Our analysis reveals that, enabled by temperature optimization, multi-modal contrastive learning not only maximizes mutual information between modalities but also adapts to intrinsic dimensions of data, which can be much lower than user-specified dimensions for representation vectors. Experiments on both synthetic and real-world datasets demonstrate the ability of contrastive learning to learn low-dimensional and informative representations, bridging theoretical insights and practical performance.
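For readers unfamiliar with the objective being analyzed, the following is a minimal sketch of a CLIP-style symmetric InfoNCE loss with an explicit temperature. It is an illustration under stated assumptions, not the paper's exact formulation: the abstract does not give the objective, and the function names `symmetric_info_nce` and `_log_softmax` are hypothetical. The temperature appears here as a plain scalar argument; in the setting the abstract describes, it would be optimized together with the encoders.

```python
import numpy as np


def _log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def symmetric_info_nce(z_x, z_y, temperature):
    """CLIP-style symmetric contrastive loss (illustrative sketch).

    z_x, z_y: arrays of shape (n, d); row i of each is a paired sample
              from the two modalities.
    temperature: positive scalar (assumed fixed here; trainable in practice).
    """
    # Project both sets of embeddings onto the unit sphere.
    z_x = z_x / np.linalg.norm(z_x, axis=1, keepdims=True)
    z_y = z_y / np.linalg.norm(z_y, axis=1, keepdims=True)

    # Pairwise cosine similarities, scaled by the temperature.
    logits = z_x @ z_y.T / temperature

    # Matched pairs sit on the diagonal; every other entry is a negative.
    idx = np.arange(z_x.shape[0])
    loss_x_to_y = -_log_softmax(logits, axis=1)[idx, idx].mean()
    loss_y_to_x = -_log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_x_to_y + loss_y_to_x)
```

Two sanity checks follow from the definition: if every embedding is the same vector, the loss equals log n for a batch of n pairs, because no pair can be told apart; if the pairs are perfectly aligned and mutually orthogonal, the loss approaches zero as the temperature shrinks.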
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Topic Modeling
Methods: Contrastive Learning
