Why Do Self-Supervised Models Transfer? Investigating the Impact of Invariance on Downstream Tasks
Linus Ericsson, Henry Gouk, Timothy M. Hospedales

TL;DR
This paper investigates how the invariances learned by self-supervised models affect their transferability to downstream tasks, and finds that performance depends on how well those invariances match what each task requires.
Contribution
The paper demonstrates that different downstream tasks require different invariance properties, and proposes a simple method for fusing representations to improve transferability.
Findings
Invariance learned by contrastive methods transfers to real-world changes.
Different tasks benefit from opposite invariance properties.
Fusing representations with complementary invariances improves transferability.
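The last finding can be illustrated with a minimal sketch. This summary does not say how the representations are fused, so the sketch assumes the simplest option: L2-normalise each model's feature vector and concatenate them. The names `l2_normalise`, `fuse`, `appearance_invariant` and `spatial_invariant` are hypothetical, and the numbers are toy values rather than real model outputs.

```python
import math


def l2_normalise(vec):
    """Scale a feature vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse(*features):
    """Assumed fusion rule: concatenate the L2-normalised features of each model."""
    fused = []
    for f in features:
        fused.extend(l2_normalise(f))
    return fused


# Toy features for one image from two hypothetical encoders, each trained
# with a different augmentation family (values are made up for illustration).
appearance_invariant = [3.0, 4.0]
spatial_invariant = [0.0, 2.0, 0.0]

fused = fuse(appearance_invariant, spatial_invariant)
print(len(fused), fused)
```

Normalising before concatenation keeps either encoder from dominating the fused vector purely because of its feature scale. A downstream linear classifier trained on `fused` can then weight whichever invariance suits the task.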
Abstract
Self-supervised learning is a powerful paradigm for representation learning on unlabelled images. A wealth of effective new methods based on instance matching rely on data-augmentation to drive learning, and these have reached a rough agreement on an augmentation scheme that optimises popular recognition benchmarks. However, there is strong reason to suspect that different tasks in computer vision require features to encode different (in)variances, and therefore likely require different augmentation strategies. In this paper, we measure the invariances learned by contrastive methods and confirm that they do learn invariance to the augmentations used and further show that this invariance largely transfers to related real-world changes in pose and lighting. We show that learned invariances strongly affect downstream task performance and confirm that different downstream tasks benefit from…
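One simple way to picture "measuring invariance" is to compare the features of an input with the features of a transformed copy. The sketch below uses mean cosine similarity as a stand-in measure. That choice is an assumption and may differ from the paper's actual metric. `invariance_score`, `sorting_encoder`, `identity_encoder` and `flip` are hypothetical toy functions, not real models or augmentations.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def invariance_score(encoder, transform, inputs):
    """Mean similarity between features of each input and of its transformed copy.

    A score near 1 means the encoder's features barely change under `transform`.
    """
    sims = [cosine_similarity(encoder(x), encoder(transform(x))) for x in inputs]
    return sum(sims) / len(sims)


def flip(x):
    """Toy transformation standing in for a horizontal flip: reverse the list."""
    return list(reversed(x))


def sorting_encoder(x):
    """Toy encoder that discards element order, so it is invariant to `flip`."""
    return sorted(x)


def identity_encoder(x):
    """Toy encoder that keeps element order, so it is sensitive to `flip`."""
    return list(x)


inputs = [[1.0, 2.0, 3.0], [0.5, 4.0, 1.5]]
order_invariant = invariance_score(sorting_encoder, flip, inputs)
order_sensitive = invariance_score(identity_encoder, flip, inputs)
print(order_invariant, order_sensitive)
```

The order-discarding encoder scores essentially 1, while the order-preserving one scores lower. In practice the same comparison would be run with a trained network as `encoder` and an image augmentation as `transform`.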
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques
