Understanding Self-Supervised Pretraining with Part-Aware Representation Learning

Jie Zhu; Jiyang Qi; Mingyu Ding; Xiaokang Chen; Ping Luo; Xinggang Wang; Wenyu Liu; Leye Wang; Jingdong Wang

arXiv:2301.11915·cs.CV·January 24, 2024·1 cites

TL;DR

This paper investigates how self-supervised pretraining methods learn part-aware representations, revealing their strengths in part-level recognition and the complementary nature of contrastive learning and masked image modeling.

Contribution

It provides a theoretical explanation of part-to-whole and part-to-part learning in self-supervised methods and empirically compares their effectiveness on recognition tasks.

Findings

01. Self-supervised models excel at part-level recognition.

02. Contrastive learning and masked image modeling are complementary.

03. Fully-supervised models outperform self-supervised ones at object-level recognition.

Abstract

In this paper, we seek to understand self-supervised pretraining by studying the capability of self-supervised representation pretraining methods to learn part-aware representations. The study is mainly motivated by the observation that random views, used in contrastive learning, and random masked (visible) patches, used in masked image modeling, often cover object parts. We explain that contrastive learning is a part-to-whole task: the projection layer hallucinates the whole-object representation from the object-part representation learned by the encoder. Masked image modeling, in contrast, is a part-to-part task: the masked patches of the object are hallucinated from the visible patches. This explanation suggests that the self-supervised pretrained encoder is required to understand the object part. We empirically compare the off-the-shelf encoders pretrained with several…
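The motivation in the abstract, that random views and random visible patches tend to cover only part of an object, can be illustrated with a small simulation. The sketch below is not the paper's code or experimental setup. It assumes a 14x14 patch grid, a hypothetical object occupying an 8x8 block of patches, square crops covering 20% to 100% of the grid area (a simplification of typical random-resized cropping, which also varies aspect ratio), and a 75% mask ratio. All function names are made up for illustration.

```python
import random

GRID = 14  # assumed patch grid, e.g. a 224x224 image split into 16x16 patches


def box_patches(box):
    """Return the set of (row, col) patch indices inside a half-open box."""
    top, left, bottom, right = box
    return {(r, c) for r in range(top, bottom) for c in range(left, right)}


def random_crop_box(rng, min_scale=0.2, max_scale=1.0):
    """Sample a square crop whose area is a random fraction of the grid (assumed range)."""
    scale = rng.uniform(min_scale, max_scale)
    side = max(1, min(GRID, round((scale * GRID * GRID) ** 0.5)))
    top = rng.randint(0, GRID - side)
    left = rng.randint(0, GRID - side)
    return top, left, top + side, left + side


def random_visible_patches(rng, mask_ratio=0.75):
    """Sample the visible patches left after random masking (assumed mask ratio)."""
    all_patches = [(r, c) for r in range(GRID) for c in range(GRID)]
    num_visible = int(len(all_patches) * (1 - mask_ratio))
    return set(rng.sample(all_patches, num_visible))


def coverage(view, obj):
    """Fraction of the object's patches that the view contains."""
    return len(view & obj) / len(obj)


rng = random.Random(0)
obj = box_patches((3, 3, 11, 11))  # hypothetical object: an 8x8 block of patches

crop_cov = [coverage(box_patches(random_crop_box(rng)), obj) for _ in range(1000)]
mask_cov = [coverage(random_visible_patches(rng), obj) for _ in range(1000)]

partial_crops = sum(c < 1.0 for c in crop_cov) / len(crop_cov)
print(f"mean object coverage, random crop:     {sum(crop_cov) / len(crop_cov):.2f}")
print(f"share of crops missing part of object: {partial_crops:.2f}")
print(f"mean object coverage, visible patches: {sum(mask_cov) / len(mask_cov):.2f}")
```

Under these assumptions the visible patches can never cover the whole object, since only 49 patches stay visible while the object spans 64. The random crops cover it fully only when they happen to be large and well placed. In both cases the encoder usually sees an object part, which is the setting the part-to-whole and part-to-part explanations describe.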

Code & Models

Repositories

jiepku/understand-ssl-part-aware
PyTorch · Official

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · 3D Surveying and Cultural Heritage

Methods: Contrastive Learning