How Mask Matters: Towards Theoretical Understandings of Masked Autoencoders

Qi Zhang; Yifei Wang; Yisen Wang

arXiv:2210.08344 · cs.LG · March 28, 2023 · 20 cites

TL;DR

This paper provides a theoretical framework for understanding Masked Autoencoders (MAE), linking them to contrastive learning, analyzing the impact of mask ratio, and proposing a new loss to improve their performance and address dimensional collapse.

Contribution

It establishes a theoretical connection between MAE and contrastive learning, introduces downstream guarantees, and proposes U-MAE to enhance performance and stability.

Findings

01 MAE implicitly aligns mask-induced positive pairs

02 U-MAE effectively addresses dimensional collapse

03 U-MAE yields significant improvements on CIFAR-10, ImageNet-100, and ImageNet-1K

Abstract

Masked Autoencoders (MAE) based on a reconstruction task have risen to be a promising paradigm for self-supervised learning (SSL) and achieve state-of-the-art performance across different benchmark datasets. However, despite its impressive empirical success, there is still limited theoretical understanding of it. In this paper, we propose a theoretical understanding of how masking matters for MAE to learn meaningful features. We establish a close connection between MAE and contrastive learning, which shows that MAE implicitly aligns the mask-induced positive pairs. Built upon this connection, we develop the first downstream guarantees for MAE methods, and analyze the effect of mask ratio. Besides, as a result of the implicit alignment, we also point out the dimensional collapse issue of MAE, and propose a Uniformity-enhanced MAE (U-MAE) loss that can effectively address this issue and…
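The abstract names a uniformity-enhanced loss but does not give its form here. As a rough illustration only, the sketch below pairs a masked reconstruction error with a penalty on the squared similarity between features of different samples, which is one common way to discourage dimensional collapse. The function names, the weight `lam`, and the exact form of the penalty are assumptions made for this example, not the authors' definition.

```python
import math
import random


def l2_normalize(v):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def random_mask(num_patches, mask_ratio, rng):
    """Return a 0/1 list with round(num_patches * mask_ratio) ones (masked patches)."""
    num_masked = int(round(num_patches * mask_ratio))
    masked_idx = set(rng.sample(range(num_patches), num_masked))
    return [1 if i in masked_idx else 0 for i in range(num_patches)]


def reconstruction_loss(pred, target, mask):
    """Mean squared error computed over masked positions only."""
    errors = [(p - t) ** 2 for p, t, m in zip(pred, target, mask) if m]
    return sum(errors) / len(errors) if errors else 0.0


def uniformity_penalty(features):
    """Mean squared inner product between normalized features of distinct samples.

    Assumed form: the value is 1.0 when all features point the same way
    (fully collapsed) and 0.0 when they are mutually orthogonal.
    """
    feats = [l2_normalize(f) for f in features]
    total, pairs = 0.0, 0
    for i, fi in enumerate(feats):
        for j, fj in enumerate(feats):
            if i != j:
                dot = sum(a * b for a, b in zip(fi, fj))
                total += dot * dot
                pairs += 1
    return total / pairs if pairs else 0.0


def combined_loss(pred, target, mask, features, lam=0.01):
    """Reconstruction error plus a weighted uniformity penalty (lam is hypothetical)."""
    return reconstruction_loss(pred, target, mask) + lam * uniformity_penalty(features)
```

For example, `random_mask(16, 0.75, random.Random(0))` marks 12 of 16 patches as masked, and `uniformity_penalty` is largest when every sample maps to the same feature direction, so minimizing the combined loss pushes features of different samples apart while still fitting the masked patches.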

Code & Models

Videos

How Mask Matters: Towards Theoretical Understandings of Masked Autoencoders · SlidesLive

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning · AI in cancer detection

Methods: Masked autoencoder