IConE: Batch Independent Collapse Prevention for Self-Supervised Representation Learning
Konstantinos Almpanakis, Anna Kreshuk

TL;DR
IConE introduces a dataset-level embedding approach for self-supervised learning that prevents collapse independently of batch size, enabling effective training with small batches or imbalanced data, especially in biomedical applications.
Contribution
IConE decouples collapse prevention from batch size by using global auxiliary embeddings, improving stability and robustness in small-batch and imbalanced data regimes.
Findings
Outperforms contrastive and non-contrastive baselines in small-batch regimes
Maintains high intrinsic dimensionality in learned representations
Demonstrates robustness to severe class imbalance
Abstract
Self-supervised learning (SSL) has revolutionized representation learning, with Joint-Embedding Architectures (JEAs) emerging as an effective approach for capturing semantic features. Existing JEAs rely on implicit or explicit batch interaction -- via negative sampling or statistical regularization -- to prevent representation collapse. This reliance becomes problematic in regimes where batch sizes must be small, such as high-dimensional scientific data, where memory constraints and class imbalance make large, well-balanced batches infeasible. We introduce IConE (Instance-Contrasted Embeddings), a framework that decouples collapse prevention from the training batch size. Rather than enforcing diversity through batch statistics, IConE maintains a global set of learnable auxiliary instance embeddings regularized by an explicit diversity objective. This transfers the anti-collapse…
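To make the idea of a batch-independent diversity objective concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not specify the loss, so the sketch assumes a mean squared cosine similarity penalty over a global bank of auxiliary embeddings, plus a cosine alignment term tying each encoded sample to its own bank entry. The names `diversity_penalty`, `alignment_loss`, and the toy banks are hypothetical.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine(u, v):
    """Cosine similarity between two vectors."""
    u, v = normalize(u), normalize(v)
    return sum(a * b for a, b in zip(u, v))

def diversity_penalty(bank):
    """Assumed diversity objective: mean squared cosine similarity over all
    distinct pairs in the global bank. It depends only on the bank, not on
    which samples happen to be in the current batch."""
    total, pairs = 0.0, 0
    for i in range(len(bank)):
        for j in range(i + 1, len(bank)):
            total += cosine(bank[i], bank[j]) ** 2
            pairs += 1
    return total / pairs

def alignment_loss(encoded, bank, indices):
    """Assumed alignment term: 1 - cosine between each encoded sample and
    the auxiliary embedding of the same dataset instance."""
    losses = [1.0 - cosine(z, bank[i]) for z, i in zip(encoded, indices)]
    return sum(losses) / len(losses)

# Toy banks (hypothetical data): one fully collapsed, one fully spread out.
collapsed_bank = [[1.0, 0.0, 0.0]] * 3
diverse_bank = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

collapsed_penalty = diversity_penalty(collapsed_bank)
diverse_penalty = diversity_penalty(diverse_bank)

# A batch of a single sample (instance index 1) still yields a usable loss,
# because diversity is enforced on the bank rather than on the batch.
batch_loss = alignment_loss([[0.0, 2.0, 0.0]], diverse_bank, [1])
```

In this toy setting the collapsed bank receives the maximum penalty and the orthogonal bank receives none, while the alignment term remains well defined for a batch of one. A real system would use learnable tensors and gradient updates in place of these fixed lists.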
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Generative Adversarial Networks and Image Synthesis · Machine Learning in Healthcare
