GMML is All you Need

Sara Atito; Muhammad Awais; Josef Kittler

arXiv:2205.14986·cs.CV·May 31, 2022

TL;DR

GMML is a novel self-supervised learning method for vision transformers that effectively captures contextual information by manipulating groups of tokens, without needing complex training tricks or large batch sizes.
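As a rough illustration of what "manipulating groups of tokens" could look like, the sketch below masks rectangular blocks of neighbouring patch tokens rather than isolated ones. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `group_mask` and `corrupt_tokens`, the 14 x 14 token grid, the 50% mask ratio, the maximum block size, and the choice to replace masked tokens with zeros are all hypothetical. The pretraining objective itself is not shown.

```python
import numpy as np


def group_mask(grid_h, grid_w, mask_ratio=0.5, max_block=4, seed=0):
    """Return a boolean (grid_h, grid_w) mask made of rectangular token blocks.

    Hypothetical helper: blocks of random size are stamped onto the grid
    until at least mask_ratio of the tokens are covered.
    """
    rng = np.random.default_rng(seed)
    mask = np.zeros((grid_h, grid_w), dtype=bool)
    target = int(mask_ratio * grid_h * grid_w)
    while mask.sum() < target:
        # Sample a block size, then a top-left corner that keeps it in bounds.
        bh = rng.integers(1, max_block + 1)
        bw = rng.integers(1, max_block + 1)
        top = rng.integers(0, grid_h - bh + 1)
        left = rng.integers(0, grid_w - bw + 1)
        mask[top:top + bh, left:left + bw] = True
    return mask


def corrupt_tokens(tokens, mask):
    """Zero out masked tokens (an assumed corruption; other choices exist).

    tokens has shape (grid_h * grid_w, dim); mask has shape (grid_h, grid_w).
    """
    corrupted = tokens.copy()
    corrupted[mask.reshape(-1)] = 0.0
    return corrupted


# Toy input: 14 x 14 patch tokens with 8 features each (assumed sizes).
tokens = np.random.default_rng(1).normal(size=(14 * 14, 8))
mask = group_mask(14, 14, mask_ratio=0.5)
corrupted = corrupt_tokens(tokens, mask)
```

Masking connected blocks instead of scattered single tokens means a masked token cannot simply be recovered from its immediate neighbours, which is one plausible reading of why group-level manipulation encourages the model to use wider context.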

Contribution

It introduces GMML, a self-supervised pretraining approach that enhances context extraction in vision transformers without requiring momentum encoders or large batches.

Findings

1. GMML outperforms existing SSL methods on vision tasks.
2. It simplifies training by removing the need for momentum encoders.
3. GMML effectively captures semantic context in images.

Abstract

Vision transformers have generated significant interest in the computer vision community because of their flexibility in exploiting contextual information, whether it is sharply confined local, or long range global. However, they are known to be data hungry. This has motivated the research in self-supervised transformer pretraining, which does not need to decode the semantic information conveyed by labels to link it to the image properties, but rather focuses directly on extracting a concise representation of the image data that reflects the notion of similarity, and is invariant to nuisance factors. The key vehicle for the self-learning process used by the majority of self-learning methods is the generation of multiple views of the training data and the creation of pretext tasks which use these views to define the notion of image similarity, and data integrity. However, this approach…

Code & Models

Repositories

sara-ahmed/gmml (PyTorch, official)

Models

erow/GMML (Hugging Face model)

Taxonomy

Methods: Self-Learning