Asymptotic and Finite-Time Guarantees for Langevin-Based Temperature Annealing in InfoNCE
Faris Chaudhry

TL;DR
This paper provides a theoretical analysis of Langevin dynamics in contrastive learning, showing how temperature schedules affect convergence to optimal representations, and linking contrastive learning with simulated annealing.
Contribution
It introduces a rigorous theoretical framework modeling embedding evolution under Langevin dynamics, extending simulated annealing guarantees to contrastive learning.
Findings
Slow logarithmic inverse-temperature schedules ensure convergence in probability to the set of globally optimal representations.
Faster temperature schedules may trap the model in suboptimal minima.
The analysis links contrastive learning dynamics to classical annealing theory.
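For orientation, the findings above can be written in the standard notation of InfoNCE and classical simulated annealing. This is a sketch in textbook form, not the paper's own statement: the symbols $z_i$, $\mathrm{sim}$, $\tau$, $U$, $\beta(t)$, $c$, and $t_0$ are assumptions introduced here, and this summary does not say how the paper relates the InfoNCE temperature $\tau$ to the annealing temperature $1/\beta$.

```latex
% InfoNCE loss with temperature \tau (standard form)
\mathcal{L}_{\mathrm{InfoNCE}}
  = -\,\mathbb{E}\left[
      \log \frac{\exp\!\big(\mathrm{sim}(z_i, z_i^{+})/\tau\big)}
                {\sum_{j} \exp\!\big(\mathrm{sim}(z_i, z_j)/\tau\big)}
    \right]

% Annealed Langevin dynamics for an energy U with inverse temperature \beta(t)
dX_t = -\nabla U(X_t)\,dt + \sqrt{2/\beta(t)}\;dB_t

% Logarithmic inverse-temperature schedule; in classical annealing results,
% c must be large enough relative to the energy barriers of U
\beta(t) = \frac{1}{c}\,\log(t + t_0)
```

In classical annealing theory, schedules in which $\beta(t)$ grows faster than logarithmically are the ones that carry no global-convergence guarantee, which matches the second finding.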
Abstract
The InfoNCE loss in contrastive learning depends critically on a temperature parameter, yet its dynamics under fixed versus annealed schedules remain poorly understood. We provide a theoretical analysis by modeling embedding evolution under Langevin dynamics on a compact Riemannian manifold. Under mild smoothness and energy-barrier assumptions, we show that classical simulated annealing guarantees extend to this setting: slow logarithmic inverse-temperature schedules ensure convergence in probability to a set of globally optimal representations, while faster schedules risk becoming trapped in suboptimal minima. Our results establish a link between contrastive learning and simulated annealing, providing a principled basis for understanding and tuning temperature schedules.
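The setting described in the abstract can be illustrated with a small numerical sketch. The code below is not the paper's method: it runs discretized Langevin dynamics on the unit circle (a simple compact manifold) with a logarithmic inverse-temperature schedule. The toy two-well energy, the step size `eta`, the constants `c` and `k0`, and all function names are assumptions chosen for illustration.

```python
import numpy as np


def log_inverse_temperature(k, c=1.0, k0=2.0):
    """Logarithmic schedule beta_k = log(k + k0) / c (c and k0 are assumed constants)."""
    return np.log(k + k0) / c


def toy_energy(x):
    """Toy energy on the unit circle: global minimum at x[0] = -1, local minimum at x[0] = +1."""
    return -x[0] ** 2 + 0.3 * x[0]


def toy_energy_grad(x):
    """Euclidean gradient of toy_energy."""
    g = np.zeros_like(x)
    g[0] = -2.0 * x[0] + 0.3
    return g


def tangent(x, v):
    """Project v onto the tangent space of the unit sphere at x."""
    return v - np.dot(v, x) * x


def annealed_langevin(x0, steps=2000, eta=1e-2, c=1.0, k0=2.0, seed=0):
    """Projected Euler-Maruyama steps of Langevin dynamics with annealed noise."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    x = x / np.linalg.norm(x)
    for k in range(steps):
        beta = log_inverse_temperature(k, c=c, k0=k0)
        # Riemannian gradient step: move along the tangent direction only.
        drift = -eta * tangent(x, toy_energy_grad(x))
        # Noise scale sqrt(2 * eta / beta) shrinks slowly as beta grows.
        noise = np.sqrt(2.0 * eta / beta) * tangent(x, rng.standard_normal(x.shape))
        # Retract back onto the sphere by renormalizing.
        x = x + drift + noise
        x = x / np.linalg.norm(x)
    return x


if __name__ == "__main__":
    x_final = annealed_langevin([1.0, 0.0])  # start in the local (suboptimal) well
    print("final point:", x_final, "energy:", toy_energy(x_final))
```

Swapping `log_inverse_temperature` for a schedule that grows faster in `k` removes noise more quickly, which is the regime the abstract describes as risking entrapment in suboptimal minima. A single run of this toy example does not demonstrate either outcome, since the guarantee is asymptotic and in probability.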
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Ferroelectric and Negative Capacitance Devices · Stochastic Gradient Optimization Techniques
