SdAE: Self-distillated Masked Autoencoder

Yabo Chen; Yuchen Liu; Dongsheng Jiang; Xiaopeng Zhang; Wenrui Dai; Hongkai Xiong; Qi Tian

arXiv:2208.00449 · cs.CV · August 2, 2022

TL;DR

SdAE is a self-distilled masked autoencoder: a student encoder-decoder reconstructs the missing information, while a teacher branch produces latent representations of the masked tokens, replacing raw pixels as the reconstruction target. A multi-fold masking strategy improves performance and reduces computational cost.

Contribution

The paper proposes a self-distilled masked autoencoder architecture with a multi-fold masking strategy, improving pre-training efficiency and downstream task performance.

Findings

01. Achieves 84.1% accuracy on ImageNet after 300 epochs of pre-training.

02. Surpasses other methods on segmentation and detection benchmarks.

03. Reduces computational complexity through multi-fold masking.
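To make the third finding concrete, here is a minimal sketch of why splitting masked tokens into folds can lower cost. It is an illustration under stated assumptions, not the authors' implementation: the 196-patch grid, the 147 masked patches, the choice of three folds, and the helper names `multi_fold_split` and `attention_cost` are all hypothetical, and self-attention cost is approximated as quadratic in the number of tokens.

```python
import random


def multi_fold_split(masked_indices, num_folds, seed=0):
    """Shuffle masked patch indices and split them into equal-size folds.

    Hypothetical helper for illustration; not taken from the SdAE codebase.
    """
    if len(masked_indices) % num_folds != 0:
        raise ValueError("number of masked patches must be divisible by num_folds")
    rng = random.Random(seed)
    shuffled = list(masked_indices)
    rng.shuffle(shuffled)
    fold_size = len(shuffled) // num_folds
    return [shuffled[i * fold_size:(i + 1) * fold_size] for i in range(num_folds)]


def attention_cost(num_tokens):
    """Rough proxy: self-attention cost grows with the square of the token count."""
    return num_tokens ** 2


# Assumed setup: 196 patches in total, of which patches 49..195 are masked.
masked = list(range(49, 196))  # 147 masked patches
folds = multi_fold_split(masked, num_folds=3)

# Processing all masked tokens as a single view vs. as three smaller folds.
single_view_cost = attention_cost(len(masked))
multi_fold_cost = sum(attention_cost(len(fold)) for fold in folds)

print(single_view_cost, multi_fold_cost)
```

Under this quadratic proxy, three folds of 49 tokens cost a third of one view of 147 tokens, which is the intuition behind the reported complexity reduction.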

Abstract

With the development of generative self-supervised learning (SSL) approaches like BEiT and MAE, learning good representations by masking random patches of the input image and reconstructing the missing information has attracted growing attention. However, BEiT and PeCo need a "pre-pretraining" stage to produce discrete codebooks for representing masked patches. MAE does not require a codebook pre-training process, but using pixels as reconstruction targets may introduce an optimization gap between pre-training and downstream tasks, since good reconstruction quality does not always lead to high descriptive capability for the model. To address these issues, we propose a simple Self-distillated masked AutoEncoder network, namely SdAE. SdAE consists of a student branch using an encoder-decoder structure to reconstruct the missing information, and a teacher branch…
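The masking step the abstract refers to, hiding random patches and reconstructing them, can be sketched as follows. This is a minimal illustration, assuming a 196-patch image and a 75% masking ratio; neither value nor the function name `random_patch_mask` comes from the text above.

```python
import random


def random_patch_mask(num_patches, mask_ratio, seed=0):
    """Randomly partition patch indices into visible and masked sets.

    Hypothetical helper for illustration; not taken from the SdAE codebase.
    """
    rng = random.Random(seed)
    order = list(range(num_patches))
    rng.shuffle(order)
    num_masked = int(round(num_patches * mask_ratio))
    masked = sorted(order[:num_masked])
    visible = sorted(order[num_masked:])
    return visible, masked


# Assumed setup: 196 patches with 75% of them masked.
visible, masked = random_patch_mask(num_patches=196, mask_ratio=0.75)
print(len(visible), len(masked))
```

In a masked autoencoder, the encoder sees only the `visible` patches, and the model is trained to recover information about the `masked` ones.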

Code & Models

Repositories

abrahamyabo/sdae (PyTorch, official)

Taxonomy

Topics: Image Processing Techniques and Applications · Digital Media Forensic Detection · Domain Adaptation and Few-Shot Learning

Methods: Masked autoencoder · Stacked Denoising Autoencoder