The "Law" of the Unconscious Contrastive Learner: Probabilistic Alignment of Unpaired Modalities
Yongwei Che, Benjamin Eysenbach

TL;DR
This paper provides a probabilistic framework for understanding how contrastive learning of paired modalities can be extended to infer relationships between unseen modality pairs, supported by theoretical proofs and numerical experiments.
Contribution
It introduces a Bayesian approach to contrastive learning that justifies aligning unpaired modalities and explores new applications in pre-trained models and language ambiguity handling.
Findings
Contrastive embeddings can recover likelihood ratios for unpaired modalities.
A theoretical proof, valid under certain assumptions, supports the alignment of unseen modality pairs.
Numerical experiments validate the proposed approach and its applications.
Abstract
While internet-scale data often comes in pairs (e.g., audio/image, image/text), we often want to perform inferences over modalities unseen together in the training data (e.g., audio/text). Empirically, this can often be addressed by learning multiple contrastive embedding spaces between existing modality pairs, implicitly hoping that unseen modality pairs will end up being aligned. This theoretical paper proves that this hope is well founded, under certain assumptions. Starting with the proper Bayesian approach of integrating out intermediate modalities, we show that directly comparing the representations of data from unpaired modalities can recover the same likelihood ratio. Our analysis builds on prior work on the geometry and probabilistic interpretation of contrastive representations, showing how these representations can answer many of the same inferences as probabilistic graphical…
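Illustration (not from the paper)
The "integrating out intermediate modalities" step in the abstract can be checked numerically on a toy problem. The sketch below is a minimal illustration, not the authors' code or experiments. It assumes three small discrete variables A, B, C with A and C conditionally independent given B; that setup, the variable names, and the sizes are all hypothetical choices made for this example. Under that assumption, the likelihood ratio for the unpaired modalities satisfies p(a,c) / (p(a) p(c)) = sum_b p(b) r_AB(a,b) r_BC(b,c), where r_AB and r_BC are the likelihood ratios of the two observed pairs. The sketch covers only this marginalization identity; it does not train embeddings or test the paper's claim that directly comparing representations recovers the same ratio.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup: discrete "modalities" A, B, C, where A and C are
# conditionally independent given B (an assumption of this sketch).
n_a, n_b, n_c = 4, 5, 3
p_b = rng.dirichlet(np.ones(n_b))                # p(b), shape (n_b,)
p_a_given_b = rng.dirichlet(np.ones(n_a), n_b)   # row b holds p(a | b)
p_c_given_b = rng.dirichlet(np.ones(n_c), n_b)   # row b holds p(c | b)

# Joint distributions of the two *observed* pairs, (A, B) and (B, C).
p_ab = (p_a_given_b * p_b[:, None]).T            # shape (n_a, n_b)
p_bc = p_c_given_b * p_b[:, None]                # shape (n_b, n_c)


def likelihood_ratio(joint):
    """Return joint / (row marginal * column marginal) for a 2-D joint table."""
    row = joint.sum(axis=1, keepdims=True)
    col = joint.sum(axis=0, keepdims=True)
    return joint / (row * col)


r_ab = likelihood_ratio(p_ab)                    # p(a,b) / (p(a) p(b))
r_bc = likelihood_ratio(p_bc)                    # p(b,c) / (p(b) p(c))

# Integrate out B: sum_b p(b) * r_ab[a, b] * r_bc[b, c].
r_ac_marginalized = (r_ab * p_b[None, :]) @ r_bc

# Ground truth for the unpaired modalities, built from the full model.
p_ac = p_a_given_b.T @ p_bc                      # sum_b p(a|b) p(b) p(c|b)
p_a = p_ab.sum(axis=1)
p_c = p_bc.sum(axis=0)
r_ac_true = p_ac / np.outer(p_a, p_c)

max_abs_error = float(np.max(np.abs(r_ac_marginalized - r_ac_true)))
print(f"max |marginalized - true| = {max_abs_error:.2e}")
```

The two ratio tables agree up to floating-point error because the identity is exact when A and C are conditionally independent given B. If that assumption is dropped, the marginalized estimate no longer has to match the true ratio.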
Taxonomy
Topics: Technology and Human Factors in Education and Health · Advanced Research in Systems and Signal Processing · Advanced Data Processing Techniques
