On the Generalization and Causal Explanation in Self-Supervised Learning

Wenwen Qiang; Zeen Song; Ziyin Gu; Jiangmeng Li; Changwen Zheng; Fuchun Sun; Hui Xiong

arXiv:2410.00772·cs.CV·October 2, 2024

TL;DR

This paper investigates overfitting in self-supervised learning, identifies indicators of overfitting, and proposes a novel method called UMM to improve generalization by aligning feature distributions across layers.

Contribution

The paper introduces UMM, a new plug-and-play mechanism that mitigates overfitting in SSL models through layer distribution alignment and causal analysis.

Findings

01. Overfitting occurs abruptly in later layers and epochs.

02. Coding rate reduction effectively measures overfitting.

03. UMM improves SSL generalization on downstream tasks.

Abstract

Self-supervised learning (SSL) methods learn from unlabeled data and achieve high generalization performance on downstream tasks. However, they may also overfit to their training data and lose the ability to adapt to new tasks. To investigate this phenomenon, we conduct experiments on various SSL methods and datasets and make two observations: (1) Overfitting occurs abruptly in later layers and epochs, while generalizing features are learned in early layers for all epochs; (2) Coding rate reduction can be used as an indicator to measure the degree of overfitting in SSL models. Based on these observations, we propose the Undoing Memorization Mechanism (UMM), a plug-and-play method that mitigates overfitting of the pre-trained feature extractor by aligning the feature distributions of the early and last layers to maximize the coding rate reduction of the last layer output.…
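The abstract does not spell out how coding rate reduction is computed, so the following is only a minimal sketch, assuming the form commonly used in the rate-reduction literature: R(Z, eps) = 1/2 * logdet(I + d / (n * eps^2) * Z^T Z) for n features of dimension d, and Delta R = R(Z) minus the size-weighted sum of R over groups. The `groups` partition, the `eps` default, and all function names are hypothetical choices for illustration and are not taken from the paper or its repository. The code is kept dependency-free; a practical version would likely use a tensor library's log-determinant routine instead of the hand-written Cholesky factorization.

```python
import math


def gram(Z):
    """Return Z^T Z (d x d) for Z given as a list of n feature rows of length d."""
    d = len(Z[0])
    return [[sum(z[i] * z[j] for z in Z) for j in range(d)] for i in range(d)]


def logdet_spd(A):
    """Log-determinant of a symmetric positive-definite matrix via Cholesky."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return 2.0 * sum(math.log(L[i][i]) for i in range(n))


def coding_rate(Z, eps=0.5):
    """Assumed definition: R(Z, eps) = 1/2 * logdet(I + d / (n * eps^2) * Z^T Z)."""
    n, d = len(Z), len(Z[0])
    scale = d / (n * eps ** 2)
    G = gram(Z)
    A = [[scale * G[i][j] + (1.0 if i == j else 0.0) for j in range(d)]
         for i in range(d)]
    return 0.5 * logdet_spd(A)


def coding_rate_reduction(Z, groups, eps=0.5):
    """Assumed definition: Delta R = R(Z) - sum_j (n_j / n) * R(Z_j).

    `groups` is a hypothetical per-sample group id (for example, which
    source image an augmented view came from).
    """
    n = len(Z)
    parts = {}
    for z, g in zip(Z, groups):
        parts.setdefault(g, []).append(z)
    within = sum(len(Zj) / n * coding_rate(Zj, eps) for Zj in parts.values())
    return coding_rate(Z, eps) - within


# Two groups occupying orthogonal directions versus fully collapsed features.
spread = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
collapsed = [[1.0, 0.0]] * 4
labels = [0, 0, 1, 1]
delta_spread = coding_rate_reduction(spread, labels)
delta_collapsed = coding_rate_reduction(collapsed, labels)
```

Under these assumptions, features whose groups span different directions give a positive reduction, while features that collapse onto a single direction give a reduction of zero, which is the sense in which a lower value could signal less discriminative last-layer features.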

Code & Models

Repositories

zeensong/umm (PyTorch, official)

Taxonomy

Topics: Educational Technology and Assessment