Bag of Image Patch Embedding Behind the Success of Self-Supervised Learning
Yubei Chen, Adrien Bardes, Zengyi Li, Yann LeCun

TL;DR
This paper argues that joint-embedding self-supervised learning (SSL) primarily learns a representation of image patches that reflects their co-occurrence. It shows that aggregating fixed-scale patch representations into an image representation achieves results similar to, or better than, the baseline SSL methods.
Contribution
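The paper introduces BagSSL, which learns a representation for fixed-scale patches and aggregates the local patch representations into an image representation. It also establishes a formal connection between joint-embedding SSL and co-occurrence modeling, which supplements the prevailing invariance perspective.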
Findings
BagSSL achieves 62% top-1 linear probing accuracy on ImageNet, even with a 32x32 patch representation.
Patch aggregation can enhance state-of-the-art SSL methods.
Local patch representations preserve locality even at invariance scales.
Abstract
Self-supervised learning (SSL) has recently achieved tremendous empirical advancements in learning image representation. However, our understanding of the principle behind learning such a representation is still limited. This work shows that joint-embedding SSL approaches primarily learn a representation of image patches, which reflects their co-occurrence. Such a connection to co-occurrence modeling can be established formally, and it supplements the prevailing invariance perspective. We empirically show that learning a representation for fixed-scale patches and aggregating local patch representations as the image representation achieves similar or even better results than the baseline methods. We denote this process as BagSSL. Even with 32x32 patch representation, BagSSL achieves 62% top-1 linear probing accuracy on ImageNet. On the other hand, with a multi-scale pretrained model, we…
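The aggregation step described in the abstract can be illustrated with a minimal sketch. It assumes the aggregation is a plain mean over patch embeddings, which the abstract does not specify, and it uses a hypothetical `make_toy_encoder` (a fixed random linear map) as a stand-in for a pretrained SSL patch encoder. The function names, patch size, and stride are illustrative choices, not the authors' implementation.

```python
import random


def extract_patches(image, patch_size, stride):
    """Slide a patch_size x patch_size window over a 2D image (list of rows).

    Each patch is returned flattened in row-major order.
    """
    height, width = len(image), len(image[0])
    patches = []
    for top in range(0, height - patch_size + 1, stride):
        for left in range(0, width - patch_size + 1, stride):
            patch = [image[top + r][left + c]
                     for r in range(patch_size)
                     for c in range(patch_size)]
            patches.append(patch)
    return patches


def make_toy_encoder(input_dim, embed_dim, seed=0):
    """Hypothetical stand-in for a pretrained patch encoder (assumption).

    It is a fixed random linear projection, used only to keep the sketch
    self-contained; a real setup would use a trained SSL network here.
    """
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 1.0) for _ in range(input_dim)]
               for _ in range(embed_dim)]

    def encode(patch):
        return [sum(w * x for w, x in zip(row, patch)) for row in weights]

    return encode


def bag_of_patches(image, encode, patch_size, stride):
    """Embed every fixed-scale patch and average the embeddings.

    The mean is an assumed aggregation rule for this illustration.
    """
    embeddings = [encode(p) for p in extract_patches(image, patch_size, stride)]
    dim = len(embeddings[0])
    return [sum(e[d] for e in embeddings) / len(embeddings) for d in range(dim)]


# Usage: an 8x8 toy image split into four non-overlapping 4x4 patches.
toy_image = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
toy_encoder = make_toy_encoder(input_dim=16, embed_dim=5)
image_representation = bag_of_patches(toy_image, toy_encoder, patch_size=4, stride=4)
```

The resulting vector has the encoder's embedding dimension regardless of how many patches the image contains, which is what allows a linear probe to be trained on top of it.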
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Cancer-related molecular mechanisms research · Mycobacterium research and diagnosis
