Understanding the Role of Equivariance in Self-supervised Learning
Yifei Wang, Kaiwen Hu, Sharut Gupta, Ziyu Ye, Yisen Wang, Stefanie Jegelka

TL;DR
This paper provides a theoretical understanding of equivariant self-supervised learning (E-SSL), revealing how it learns useful features through an explaining-away effect that benefits downstream tasks, and offers principles for practical design.
Contribution
It introduces an information-theoretic framework explaining how E-SSL leverages equivariance and classification synergy, guiding better design principles for future methods.
Findings
Identifies a critical explaining-away effect in E-SSL.
Shows synergy between equivariant and classification tasks.
Provides principles for practical E-SSL design.
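The explaining-away effect named above can be illustrated with a generic toy collider. The sketch below is not the paper's derivation: the binary class `c`, binary augmentation `a`, and the XOR rule producing the observation `x` are all hypothetical choices made for illustration. It shows that two independent causes share no information marginally, yet become fully informative about each other once their common effect is observed.

```python
import math
from collections import defaultdict
from itertools import product


def mutual_information(joint, a_idx, b_idx):
    """I(A;B) in bits, given a dict mapping outcome tuples to probabilities."""
    pa, pb, pab = defaultdict(float), defaultdict(float), defaultdict(float)
    for outcome, p in joint.items():
        pa[outcome[a_idx]] += p
        pb[outcome[b_idx]] += p
        pab[(outcome[a_idx], outcome[b_idx])] += p
    return sum(
        p * math.log2(p / (pa[a] * pb[b]))
        for (a, b), p in pab.items()
        if p > 0
    )


def conditional_mutual_information(joint, a_idx, b_idx, z_idx):
    """I(A;B|Z) in bits: average of I(A;B | Z=z) weighted by P(Z=z)."""
    pz = defaultdict(float)
    for outcome, p in joint.items():
        pz[outcome[z_idx]] += p
    total = 0.0
    for z, p_z in pz.items():
        # Renormalise the slice of the joint where Z = z.
        cond = {o: p / p_z for o, p in joint.items() if o[z_idx] == z}
        total += p_z * mutual_information(cond, a_idx, b_idx)
    return total


# Hypothetical toy model: class c and augmentation a are independent fair
# bits, and the observation x is their XOR (an extreme collider).
joint = {(c, a, c ^ a): 0.25 for c, a in product((0, 1), repeat=2)}

mi = mutual_information(joint, 0, 1)                    # I(C;A)
cmi = conditional_mutual_information(joint, 0, 1, 2)    # I(C;A|X)
print(f"I(C;A)   = {mi:.3f} bits")
print(f"I(C;A|X) = {cmi:.3f} bits")
```

In this toy setting, knowing the class removes all remaining uncertainty about the augmentation once the observation is fixed, which is the intuition behind a synergy between an augmentation-prediction task and a classification task. Real image data would sit between the two extremes.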
Abstract
Contrastive learning has been a leading paradigm for self-supervised learning, but it is widely observed that it comes at the price of sacrificing useful features (e.g., colors) by being invariant to data augmentations. Given this limitation, there has been a surge of interest in equivariant self-supervised learning (E-SSL) that learns features to be augmentation-aware. However, even for the simplest rotation prediction method, there is a lack of rigorous understanding of why, when, and how E-SSL learns useful features for downstream tasks. To bridge this gap between practice and theory, we establish an information-theoretic perspective to understand the generalization ability of E-SSL. In particular, we identify a critical explaining-away effect in E-SSL that creates a synergy between the equivariant and classification tasks. This synergy effect encourages models to extract…
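For readers unfamiliar with the rotation prediction method mentioned in the abstract, the data side of that pretext task can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `make_rotation_batch`, the four-way rotation set, and the random stand-in images are assumptions. Each image is rotated by a random multiple of 90 degrees, and the rotation index becomes the label a network would be trained to predict.

```python
import numpy as np


def make_rotation_batch(images, rng):
    """Build a rotation-prediction batch (hypothetical helper).

    images: array of shape (N, H, W, C) with H == W.
    Returns the rotated images and integer labels k in {0, 1, 2, 3},
    where label k means a rotation by k * 90 degrees.
    """
    labels = rng.integers(0, 4, size=len(images))
    rotated = np.stack(
        [np.rot90(img, k=int(k), axes=(0, 1)) for img, k in zip(images, labels)]
    )
    return rotated, labels


rng = np.random.default_rng(0)
images = rng.random((8, 32, 32, 3))   # random stand-ins for real images
rotated, labels = make_rotation_batch(images, rng)
print(rotated.shape, labels)
```

A classifier trained on `(rotated, labels)` has to retain augmentation-aware information in its features, in contrast to an invariance objective that would discard it.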
Taxonomy
Topics: Innovative Teaching and Learning Methods
