Understanding InfoNCE: Transition Probability Matrix Induced Feature Clustering
Ge Cheng, Shuo Wang, Yun Zhang

TL;DR
This paper provides a theoretical understanding of InfoNCE by modeling data augmentation dynamics with a transition probability matrix, and introduces SC-InfoNCE, a new loss that improves feature clustering and performance across multiple domains.
Contribution
It introduces a feature space model with a transition matrix to explain InfoNCE's behavior and proposes SC-InfoNCE, a flexible loss function for better feature alignment.
Findings
SC-InfoNCE improves feature clustering in various domains.
Theoretical model explains InfoNCE's optimization behavior.
Experiments show consistent performance gains across datasets.
Abstract
Contrastive learning has emerged as a cornerstone of unsupervised representation learning across vision, language, and graph domains, with InfoNCE as its dominant objective. Despite its empirical success, the theoretical underpinnings of InfoNCE remain limited. In this work, we introduce an explicit feature space to model augmented views of samples and a transition probability matrix to capture data augmentation dynamics. We demonstrate that InfoNCE optimizes the probability of two views sharing the same source toward a constant target defined by this matrix, naturally inducing feature clustering in the representation space. Leveraging this insight, we propose Scaled Convergence InfoNCE (SC-InfoNCE), a novel loss function that introduces a tunable convergence target to flexibly control feature similarity alignment. By scaling the target matrix, SC-InfoNCE enables flexible control over…
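For readers unfamiliar with the objective the abstract analyzes, the following is a minimal sketch of the standard batch InfoNCE loss, not the authors' code. The function name `info_nce`, the `temperature` parameter, and the use of cosine similarity are illustrative assumptions. SC-InfoNCE is not sketched because its exact form is not given in the text above.

```python
import numpy as np

def info_nce(z1, z2, temperature=0.1):
    """Standard InfoNCE over a batch of paired views (illustrative sketch).

    z1, z2: arrays of shape (n, d). Row i of z1 and row i of z2 are assumed
    to be embeddings of two augmented views of the same source sample.
    """
    # Normalize rows so the dot product is cosine similarity (an assumption).
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)

    # Pairwise similarity logits: entry (i, j) compares view i with view j.
    logits = z1 @ z2.T / temperature

    # Row-wise log-softmax, shifted by the row max for numerical stability.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Positives sit on the diagonal: the two views that share a source.
    return -np.mean(np.diag(log_prob))

rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))
loss_aligned = info_nce(z, z)           # positives are identical embeddings
loss_mismatched = info_nce(z, z[::-1])  # positives are unrelated embeddings
```

In this sketch the softmax over each row plays the role of an estimated probability that two views share the same source, which is the quantity the abstract says InfoNCE drives toward a target defined by the transition probability matrix.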
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Graph Neural Networks · Multimodal Machine Learning Applications
