Selfie: Self-supervised Pretraining for Image Embedding
Trieu H. Trinh, Minh-Thang Luong, Quoc V. Le

TL;DR
Selfie introduces a self-supervised pretraining method for image embeddings that improves performance and stability across various datasets and data regimes by predicting masked patches using contrastive learning.
Contribution
The paper presents Selfie, a novel self-supervised pretraining approach for images that generalizes masked language modeling to continuous data using contrastive predictive coding.
Findings
Pretraining with Selfie improves accuracy on the CIFAR-10 and ImageNet benchmarks compared to training from scratch.
Training is more stable, with lower variance across runs, in low-data regimes.
ResNet-50 benefits from Selfie pretraining even when only a small amount of labeled data is available.
Abstract
We introduce a pretraining technique called Selfie, which stands for SELFie supervised Image Embedding. Selfie generalizes the concept of masked language modeling of BERT (Devlin et al., 2019) to continuous data, such as images, by making use of the Contrastive Predictive Coding loss (Oord et al., 2018). Given masked-out patches in an input image, our method learns to select the correct patch, among other "distractor" patches sampled from the same image, to fill in the masked location. This classification objective sidesteps the need for predicting exact pixel values of the target patches. The pretraining architecture of Selfie includes a network of convolutional blocks to process patches followed by an attention pooling network to summarize the content of unmasked patches before predicting masked ones. During finetuning, we reuse the convolutional weights found by pretraining. We…
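The patch-selection objective described in the abstract can be illustrated with a minimal sketch. It assumes that the attention pooling network has already produced a query vector for one masked location and that each candidate patch (the correct one plus the distractors) has already been embedded by the convolutional blocks. The dot-product scoring, the function name `patch_selection_loss`, and all toy values are assumptions made for illustration, not details confirmed by the text above.

```python
import math
from typing import List, Sequence, Tuple


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def patch_selection_loss(
    query: Sequence[float],
    candidates: Sequence[Sequence[float]],
    target_index: int,
) -> Tuple[float, List[float]]:
    """Softmax cross-entropy over candidate patches for one masked location.

    query:        summary vector for the masked location (assumed to come
                  from the attention pooling network).
    candidates:   embeddings of the correct patch and the distractor patches.
    target_index: position of the correct patch within `candidates`.

    Returns the loss and the log-probability assigned to each candidate.
    """
    # Score each candidate against the query (dot product is an assumption).
    logits = [dot(query, c) for c in candidates]

    # Numerically stable log-softmax: subtract the max before exponentiating.
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(l - m) for l in logits))
    log_probs = [l - log_norm for l in logits]

    # Classification loss: negative log-probability of the correct patch.
    return -log_probs[target_index], log_probs


if __name__ == "__main__":
    # Toy example: one correct patch (index 0) and two distractors.
    query = [1.0, 0.0]
    candidates = [[2.0, 0.0], [0.0, 1.0], [-1.0, 0.5]]
    loss, log_probs = patch_selection_loss(query, candidates, target_index=0)
    print(f"loss = {loss:.4f}")
```

Because the loss only asks which candidate fits the masked location, the model never has to regress pixel values, which is the point the abstract makes about sidestepping exact pixel prediction.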
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications
Methods: Linear Layer · InfoNCE · Contrastive Predictive Coding · Residual Connection · Attention Dropout · Linear Warmup With Linear Decay · Weight Decay · Dense Connections · Adam
