Improving Pre-trained Self-Supervised Embeddings Through Effective Entropy Maximization

Deep Chakraborty; Yann LeCun; Tim G. J. Rudner; Erik Learned-Miller

arXiv:2411.15931·cs.LG·March 17, 2025

Open Access

TL;DR

This paper introduces an effective entropy maximization criterion (E2MC) for self-supervised learning. E2MC is defined through easy-to-estimate, low-dimensional constraints, which avoids relying on high-dimensional entropy estimates that perform poorly in more than a few dimensions, and it improves downstream task performance.

Contribution

The paper proposes E2MC, a low-dimensional entropy maximization criterion that improves already-trained SSL models when training is continued for only a handful of epochs.

Findings

01. Continuing to train an already-trained SSL model with E2MC for only a few epochs improves downstream performance.

02. Continued training with E2MC outperforms continued training with other criteria.

03. Performance gains are validated through ablation studies.

Abstract

A number of different architectures and loss functions have been applied to the problem of self-supervised learning (SSL), with the goal of developing embeddings that provide the best possible pre-training for as-yet-unknown, lightly supervised downstream tasks. One of these SSL criteria is to maximize the entropy of a set of embeddings in some compact space. But the goal of maximizing the embedding entropy often depends -- whether explicitly or implicitly -- upon high dimensional entropy estimates, which typically perform poorly in more than a few dimensions. In this paper, we motivate an effective entropy maximization criterion (E2MC), defined in terms of easy-to-estimate, low-dimensional constraints. We demonstrate that using it to continue training an already-trained SSL model for only a handful of epochs leads to a consistent and, in some cases, significant improvement in…
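The abstract does not say which low-dimensional constraints E2MC uses. As one illustration of why low-dimensional quantities are easy to estimate, the sketch below computes a one-dimensional differential entropy estimate for each embedding coordinate with an m-spacing estimator. The function names, the choice of estimator, and the default m = sqrt(n) are assumptions made for this example and should not be read as the paper's definition of E2MC.

```python
import numpy as np


def marginal_entropy_m_spacing(x, m=None):
    """m-spacing estimate of the differential entropy of a 1-D sample.

    Illustrative helper (hypothetical name); not taken from the paper.
    """
    x = np.sort(np.asarray(x, dtype=np.float64))
    n = x.shape[0]
    if m is None:
        # Common heuristic: spacing order grows like sqrt(n).
        m = max(1, int(round(np.sqrt(n))))
    # Differences between order statistics that are m positions apart.
    spacings = x[m:] - x[:-m]
    # Guard against log(0) when the sample contains repeated values.
    spacings = np.maximum(spacings, 1e-12)
    return float(np.mean(np.log((n + 1) / m * spacings)))


def per_dimension_entropy(z, m=None):
    """Apply the 1-D estimator to every coordinate of an (n, d) embedding matrix."""
    z = np.asarray(z, dtype=np.float64)
    return np.array(
        [marginal_entropy_m_spacing(z[:, j], m) for j in range(z.shape[1])]
    )


# Toy embeddings in the unit cube; coordinate 0 is squeezed into [0, 0.1].
rng = np.random.default_rng(0)
embeddings = rng.uniform(0.0, 1.0, size=(4000, 8))
embeddings[:, 0] *= 0.1
entropies = per_dimension_entropy(embeddings)
```

In this toy setup the squeezed coordinate receives a clearly lower entropy estimate than the others, which is the kind of signal a per-dimension criterion could act on. Each estimate uses only a sort over one coordinate, so the cost and sample requirements do not grow with the way a joint d-dimensional density estimate would.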

Taxonomy

Topics: Neural Networks and Applications · Text and Document Classification Technologies · Face and Expression Recognition

Methods: Sparse Evolutionary Training