Hard-Negative Sampling for Contrastive Learning: Optimal Representation Geometry and Neural- vs Dimensional-Collapse
Ruijie Jiang, Thuan Nguyen, Shuchin Aeron, Prakash Ishwar

TL;DR
This paper provides theoretical and empirical insights into how hard-negative sampling influences the geometry of learned representations in contrastive learning, demonstrating conditions under which neural collapse occurs versus dimensional collapse.
Contribution
It offers the first unified theoretical analysis of supervised and unsupervised contrastive learning that does not rely on class-conditional independence, highlighting the role of hard-negative sampling and feature normalization.
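To make the idea of hard-negative sampling concrete, here is a minimal sketch of an InfoNCE-style loss for a single anchor in which negatives are reweighted by an exponential tilt. The function name `info_nce`, the hardness parameter `beta`, and the choice of exponential tilting are illustrative assumptions, not necessarily the loss or hardening function analyzed in the paper.

```python
import numpy as np

def info_nce(pos_sim, neg_sims, beta=0.0):
    """InfoNCE-style loss for one anchor with optional hard-negative reweighting.

    pos_sim:  similarity between the anchor and its positive (scalar).
    neg_sims: similarities between the anchor and n negatives, shape (n,).
    beta:     hypothetical hardness level; beta = 0 weights negatives uniformly.
    """
    neg_sims = np.asarray(neg_sims, dtype=float)
    # Exponential tilting: more similar (harder) negatives get larger weight.
    logits = beta * neg_sims
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    # n times the weighted mean of exp(similarity) over the negatives.
    neg_term = len(neg_sims) * np.sum(weights * np.exp(neg_sims))
    return float(np.log1p(neg_term / np.exp(pos_sim)))

rng = np.random.default_rng(0)
negs = rng.uniform(-1.0, 1.0, size=32)   # cosine-like similarities
plain = info_nce(0.8, negs, beta=0.0)    # uniform negatives
hard = info_nce(0.8, negs, beta=5.0)     # tilted toward hard negatives
```

In this sketch the hardened loss is never smaller than the plain one, because tilting moves weight toward negatives with larger `exp(similarity)`. This mirrors, for one simple hardening function, the lower-bound relationship between the hard and standard losses.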
Findings
When combined with feature normalization, hard-negative sampling promotes neural collapse.
Without normalization, learned representations tend to suffer from dimensional collapse.
Adam optimization with hard negatives can reach the optimal (neural-collapse) geometry under suitable conditions, including feature normalization.
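One common way to check for dimensional collapse is to count how many directions the learned features actually occupy. The sketch below does this with the singular values of the centered feature matrix. The helper `numerical_rank`, its tolerance, and the synthetic features are hypothetical and are not taken from the paper's experiments.

```python
import numpy as np

def numerical_rank(features, rel_tol=1e-6):
    """Count singular values of the centered features above rel_tol * largest."""
    centered = features - features.mean(axis=0, keepdims=True)
    s = np.linalg.svd(centered, compute_uv=False)
    if s[0] == 0.0:
        return 0
    return int(np.sum(s > rel_tol * s[0]))

rng = np.random.default_rng(0)
dim = 16
# Synthetic "collapsed" features: every sample lies on a single line.
collapsed = rng.normal(size=(200, 1)) @ rng.normal(size=(1, dim))
# Synthetic features that occupy all available dimensions.
spread = rng.normal(size=(200, dim))

rank_collapsed = numerical_rank(collapsed)
rank_spread = numerical_rank(spread)
```

A rank far below the embedding dimension suggests that the representation uses only a low-dimensional subspace. Note that a neural-collapse geometry with K classes also has low rank (K - 1 after centering), so the rank should be read against the number of classes rather than the embedding dimension alone.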
Abstract
For a widely-studied data model and general loss and sample-hardening functions we prove that the losses of Supervised Contrastive Learning (SCL), Hard-SCL (HSCL), and Unsupervised Contrastive Learning (UCL) are minimized by representations that exhibit Neural-Collapse (NC), i.e., the class means form an Equiangular Tight Frame (ETF) and data from the same class are mapped to the same representation. We also prove that for any representation mapping, the HSCL and Hard-UCL (HUCL) losses are lower bounded by the corresponding SCL and UCL losses. In contrast to existing literature, our theoretical results for SCL do not require class-conditional independence of augmented views and work for a general loss function class that includes the widely used InfoNCE loss function. Moreover, our proofs are simpler, compact, and transparent. Similar to existing literature, our theoretical claims also…
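The neural-collapse geometry described above can be illustrated with a small numerical sketch. It builds a simplex Equiangular Tight Frame for K classes and checks its defining properties. The construction (centering the standard basis and normalizing) is one standard choice, and the names `simplex_etf` and `K` are illustrative rather than taken from the paper.

```python
import numpy as np

def simplex_etf(num_classes):
    """Return a (K, K) matrix whose rows are unit vectors forming a simplex ETF."""
    k = num_classes
    # Center the standard basis vectors, then rescale each row to unit norm.
    centered = np.eye(k) - np.ones((k, k)) / k
    return centered / np.linalg.norm(centered, axis=1, keepdims=True)

K = 4
means = simplex_etf(K)                     # idealized class means
gram = means @ means.T                     # pairwise inner products
off_diag = gram[~np.eye(K, dtype=bool)]    # inner products between distinct classes
```

Every pair of distinct class means has the same inner product, -1/(K - 1), which is the most negative common value achievable by K unit vectors. Under neural collapse, all samples of a class would map onto the corresponding row of `means`.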
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Sparse and Compressive Sensing Techniques · Microwave Imaging and Scattering Analysis
Methods: InfoNCE · Adam · Contrastive Learning
