Engineering the Neural Collapse Geometry of Supervised-Contrastive Loss
Jaidev Gill, Vala Vakilian, Christos Thrampoulidis

TL;DR
This paper explores how modifying supervised-contrastive loss with prototypes influences the geometry of learned embeddings, connecting it to cross-entropy loss and demonstrating improved feature alignment through experiments.
Contribution
It introduces methods to engineer embedding geometry in supervised-contrastive learning by incorporating prototypes, and links the resulting loss to cross-entropy with a fixed classifier and normalized embeddings.
Findings
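Including prototypes in every batch aligns the geometry of the learnt embeddings with that of the prototypes.
When the number of prototypes far exceeds the original batch size, SCL is connected to cross-entropy loss with a fixed classifier and normalized embeddings.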
Empirical validation on vision datasets confirms the effectiveness of the approach.
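To make the second finding concrete, the following is a minimal NumPy sketch of cross-entropy with a fixed (non-trained) classifier and normalized embeddings. It is an illustration under stated assumptions, not the authors' implementation: the function names, the use of the prototypes as the fixed classifier rows, and the `temperature` parameter are all hypothetical choices made for this example.

```python
import numpy as np

def normalize(x, eps=1e-12):
    # Project each row onto the unit sphere.
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)

def fixed_classifier_ce(embeddings, labels, prototypes, temperature=0.1):
    # Assumption: row c of `prototypes` plays the role of the fixed
    # classifier weight for class c; only the embeddings would be trained.
    h = normalize(np.asarray(embeddings, dtype=float))
    w = normalize(np.asarray(prototypes, dtype=float))
    labels = np.asarray(labels)
    logits = h @ w.T / temperature
    # Numerically stable log-softmax over classes.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Mean negative log-likelihood of the true class.
    return -log_probs[np.arange(len(labels)), labels].mean()
```

In this sketch the loss is small when each normalized embedding points in the direction of its own class prototype and large when it points toward another class's prototype, which is the sense in which a fixed classifier can steer the embedding geometry.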
Abstract
Supervised-contrastive loss (SCL) is an alternative to cross-entropy (CE) for classification tasks that makes use of similarities in the embedding space to allow for richer representations. In this work, we propose methods to engineer the geometry of these learnt feature embeddings by modifying the contrastive loss. In pursuit of adjusting the geometry, we explore the impact of prototypes: fixed embeddings included during training to alter the final feature geometry. Specifically, through empirical findings, we demonstrate that the inclusion of prototypes in every batch induces the geometry of the learnt embeddings to align with that of the prototypes. We gain further insights by considering a limiting scenario where the number of prototypes far exceeds the original batch size. Through this, we establish a connection to cross-entropy (CE) loss with a fixed classifier and normalized…
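The idea of including prototypes in every batch can be sketched as follows. This is a hedged illustration assuming the commonly used supervised-contrastive formulation (each anchor is contrasted against all other samples, with same-label samples as positives) and assuming that prototypes are simply appended to the batch as extra labelled samples, one per class. The function names, the `temperature` value, and the one-prototype-per-class layout are assumptions made here, not details confirmed by the abstract.

```python
import numpy as np

def normalize(x, eps=1e-12):
    # Project each row onto the unit sphere.
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)

def supcon_loss(features, labels, temperature=0.1):
    # Supervised-contrastive loss over one batch (needs at least 2 samples).
    z = normalize(np.asarray(features, dtype=float))
    labels = np.asarray(labels)
    n = len(labels)
    sim = z @ z.T / temperature
    not_self = ~np.eye(n, dtype=bool)
    # Log-sum-exp over all samples except the anchor itself.
    masked = np.where(not_self, sim, -np.inf)
    row_max = masked.max(axis=1, keepdims=True)
    log_denom = row_max + np.log(np.exp(masked - row_max).sum(axis=1, keepdims=True))
    log_prob = sim - log_denom
    # Positives: same label, excluding the anchor.
    positives = (labels[:, None] == labels[None, :]) & not_self
    n_pos = positives.sum(axis=1)
    has_pos = n_pos > 0  # skip anchors with no positive in the batch
    pos_sum = np.where(positives, log_prob, 0.0).sum(axis=1)
    return (-pos_sum[has_pos] / n_pos[has_pos]).mean()

def supcon_with_prototypes(features, labels, prototypes, temperature=0.1):
    # Assumption: prototype c carries label c and is treated like any
    # other sample; in training it would be held fixed (no gradient).
    proto_labels = np.arange(len(prototypes))
    all_features = np.concatenate([features, prototypes], axis=0)
    all_labels = np.concatenate([np.asarray(labels), proto_labels])
    return supcon_loss(all_features, all_labels, temperature)
```

Under these assumptions, embeddings that coincide with their class prototype give a lower loss than embeddings aligned with a different class's prototype, which mirrors the alignment effect the abstract describes.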
Taxonomy
Topics: Visual Attention and Saliency Detection · Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning
Methods: ALIGN
