Hard-Negative Sampling for Contrastive Learning: Optimal Representation Geometry and Neural- vs Dimensional-Collapse

Ruijie Jiang; Thuan Nguyen; Shuchin Aeron; Prakash Ishwar

arXiv:2311.05139·cs.LG·May 8, 2025·2 cites

TL;DR

This paper provides theoretical and empirical insights into how hard-negative sampling influences the geometry of learned representations in contrastive learning, demonstrating conditions under which neural collapse occurs versus dimensional collapse.

Contribution

It offers the first unified theoretical analysis of supervised and unsupervised contrastive learning that does not rely on class-conditional independence, highlighting the role of hard-negative sampling and feature normalization.

Findings

01. Hard-negative sampling promotes neural collapse with feature normalization.

02. Without normalization, learned representations tend to suffer from dimensional collapse.

03. Adam optimization with hard negatives can reach the neural-collapse geometry under certain conditions, such as feature normalization.
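To make the role of hard negatives and normalization concrete, here is a minimal single-anchor sketch of an InfoNCE-style loss with reweighted negatives. It is not the authors' implementation: the function name `hard_infonce`, the exponential hardening weights `exp(beta * similarity)`, and the parameters `beta` and `temperature` are all assumptions chosen for illustration. Features are projected onto the unit sphere before similarities are computed.

```python
import math


def normalize(v):
    """Project a vector onto the unit sphere (feature normalization)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def hard_infonce(anchor, positive, negatives, beta=0.0, temperature=0.5):
    """InfoNCE-style loss for one anchor with hardness-weighted negatives.

    Assumption: negatives are reweighted by exp(beta * similarity), rescaled
    so the weights average to 1. beta = 0 recovers the plain (unweighted) loss.
    """
    a = normalize(anchor)
    p = normalize(positive)
    negs = [normalize(n) for n in negatives]

    pos_logit = dot(a, p) / temperature
    sims = [dot(a, n) for n in negs]

    # Hardening weights: negatives closer to the anchor count more.
    raw = [math.exp(beta * s) for s in sims]
    mean_raw = sum(raw) / len(raw)
    weights = [r / mean_raw for r in raw]

    neg_term = sum(w * math.exp(s / temperature) for w, s in zip(weights, sims))
    return -pos_logit + math.log(math.exp(pos_logit) + neg_term)


anchor = [1.0, 0.0]
positive = [0.9, 0.1]
negatives = [[0.8, 0.6], [-1.0, 0.0], [0.0, 1.0]]

plain_loss = hard_infonce(anchor, positive, negatives, beta=0.0)
hard_loss = hard_infonce(anchor, positive, negatives, beta=2.0)
```

In this sketch the hardened loss is never smaller than the plain one, because the weights and the exponentiated similarities both increase with similarity. That mirrors, for this toy weighting only, the lower-bound relation between the hard and plain losses that the abstract describes.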

Abstract

For a widely-studied data model and general loss and sample-hardening functions we prove that the losses of Supervised Contrastive Learning (SCL), Hard-SCL (HSCL), and Unsupervised Contrastive Learning (UCL) are minimized by representations that exhibit Neural-Collapse (NC), i.e., the class means form an Equiangular Tight Frame (ETF) and data from the same class are mapped to the same representation. We also prove that for any representation mapping, the HSCL and Hard-UCL (HUCL) losses are lower bounded by the corresponding SCL and UCL losses. In contrast to existing literature, our theoretical results for SCL do not require class-conditional independence of augmented views and work for a general loss function class that includes the widely used InfoNCE loss function. Moreover, our proofs are simpler, compact, and transparent. Similar to existing literature, our theoretical claims also…
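The Neural-Collapse geometry named in the abstract can be checked numerically. The sketch below assumes the unit-norm simplex ETF convention common in the neural-collapse literature: K class means of norm 1 whose pairwise inner products all equal -1/(K-1). The helper names `simplex_etf` and `is_nc_geometry` and the tolerance are hypothetical and do not come from the paper or its repository.

```python
import math


def simplex_etf(num_classes):
    """Build K unit vectors in R^K with pairwise inner product -1/(K-1).

    Construction: center the standard basis vectors and rescale to unit norm.
    """
    k = num_classes
    scale = math.sqrt(k / (k - 1))
    return [
        [scale * ((1.0 if i == j else 0.0) - 1.0 / k) for j in range(k)]
        for i in range(k)
    ]


def gram(vectors):
    """Matrix of all pairwise inner products."""
    return [[sum(a * b for a, b in zip(u, v)) for v in vectors] for u in vectors]


def is_nc_geometry(class_means, tol=1e-8):
    """Return True if the class means form a unit-norm simplex ETF."""
    k = len(class_means)
    g = gram(class_means)
    for i in range(k):
        for j in range(k):
            target = 1.0 if i == j else -1.0 / (k - 1)
            if abs(g[i][j] - target) > tol:
                return False
    return True


etf_means = simplex_etf(4)
orthogonal_means = [[1.0, 0.0], [0.0, 1.0]]  # equal norms, but angle is 90 degrees
```

Here `is_nc_geometry(etf_means)` holds, while the orthogonal pair fails the check because two classes would need an inner product of -1. In practice the input would be per-class averages of learned features, and the second half of the collapse condition (all samples of a class mapping to the same point) would need a separate within-class variance check.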

Code & Models

Repositories

rjiang03/hcl (PyTorch, official)

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Sparse and Compressive Sensing Techniques · Microwave Imaging and Scattering Analysis

Methods: InfoNCE · Adam · Contrastive Learning