Contrastive Tuning: A Little Help to Make Masked Autoencoders Forget

Johannes Lehner; Benedikt Alkin; Andreas Fürst; Elisabeth Rumetshofer; Lukas Miklautz; Sepp Hochreiter

arXiv:2304.10520 · cs.CV · September 15, 2023 · 1 citation

TL;DR

This paper introduces MAE-CT, a contrastive tuning method that enhances masked autoencoders to produce more object-focused, semantically clustered features suitable for downstream classification tasks without extensive labeled data.

Contribution

The paper proposes MAE-CT, a novel sequential contrastive tuning approach that improves masked autoencoders by inducing semantic object clusters without labels, with minimal additional computation.

Findings

01. MAE-CT outperforms previous self-supervised methods in classification tasks.

02. Achieves state-of-the-art linear probing accuracy of 82.2% with ViT-H/16.

03. Requires only minimal data augmentations and 10% additional computation.

Abstract

Masked Image Modeling (MIM) methods, like Masked Autoencoders (MAE), efficiently learn a rich representation of the input. However, for adapting to downstream tasks, they require a sufficient amount of labeled data since their rich features code not only objects but also less relevant image background. In contrast, Instance Discrimination (ID) methods focus on objects. In this work, we study how to combine the efficiency and scalability of MIM with the ability of ID to perform downstream classification in the absence of large amounts of labeled data. To this end, we introduce Masked Autoencoder Contrastive Tuning (MAE-CT), a sequential approach that utilizes the implicit clustering of the Nearest Neighbor Contrastive Learning (NNCLR) objective to induce abstraction in the topmost layers of a pre-trained MAE. MAE-CT tunes the rich features such that they form semantic clusters of objects…
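
The abstract credits the NNCLR objective with inducing semantic clusters. The following is a minimal sketch of a nearest-neighbor contrastive loss, not the authors' implementation. It assumes plain Python lists as embeddings, a small in-memory `queue` of past embeddings, and a `temperature` of 0.2; all function names and values here are hypothetical and chosen for illustration. For each anchor embedding, the loss swaps the anchor for its nearest neighbor in the queue and then asks that neighbor to match the second view of the same image against the other images in the batch (an InfoNCE-style cross-entropy).

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def nearest_neighbor(query, queue):
    """Return the queue entry most similar to `query` (both assumed normalized)."""
    return max(queue, key=lambda item: dot(query, item))


def nnclr_loss(view_a, view_b, queue, temperature=0.2):
    """Nearest-neighbor contrastive loss over a batch (illustrative sketch).

    view_a, view_b: embeddings of two augmented views; row i of each comes
                    from the same image.
    queue:          embeddings of previously seen samples (assumed non-empty).
    """
    a = [l2_normalize(v) for v in view_a]
    b = [l2_normalize(v) for v in view_b]
    q = [l2_normalize(v) for v in queue]

    total = 0.0
    for i, anchor in enumerate(a):
        # Replace the anchor by its nearest neighbor from the queue.
        neighbor = nearest_neighbor(anchor, q)
        # Similarity of the neighbor to every second-view embedding.
        logits = [dot(neighbor, other) / temperature for other in b]
        # Cross-entropy with the matching index i as the positive,
        # computed with a numerically stable log-sum-exp.
        peak = max(logits)
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_sum_exp - logits[i]
    return total / len(a)


if __name__ == "__main__":
    # Toy 2-D embeddings: two images, two views each, plus a small queue.
    view_a = [[1.0, 0.0], [0.0, 1.0]]
    view_b = [[1.0, 0.1], [0.1, 1.0]]
    queue = [[1.0, 0.05], [0.05, 1.0], [-1.0, -1.0]]
    print("loss:", nnclr_loss(view_a, view_b, queue))
```

Because the positive is a neighbor rather than the anchor itself, samples that share a neighbor are pulled toward each other, which is one way to read the "implicit clustering" the abstract refers to. The sketch omits the projection and prediction heads, the queue update, and the layer-wise tuning of a pre-trained MAE.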

Code & Models

Repositories

ml-jku/mae-ct (PyTorch, official)

Videos

Contrastive Tuning: A Little Help to Make Masked Autoencoders Forget · underline

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · COVID-19 diagnosis using AI · Multimodal Machine Learning Applications

Methods: Multi-Head Attention · Attention Is All You Need · Masked autoencoder · Position-Wise Feed-Forward Layer · Label Smoothing · Dropout · Absolute Position Encodings · Residual Connection · k-Nearest Neighbors · Softmax