Masked Momentum Contrastive Learning for Zero-shot Semantic Understanding

Jiantao Wu; Shentong Mo; Muhammad Awais; Sara Atito; Zhenhua Feng; Josef Kittler

arXiv:2308.11448·cs.CV·August 23, 2023

TL;DR

This paper introduces MMC, a self-supervised learning method that enhances zero-shot semantic segmentation by improving the discriminative power of vision transformers without finetuning.

Contribution

The paper proposes MMC, a self-supervised pretraining (SSP) approach that combines masked image modeling, momentum-based self-distillation, and global contrast to improve zero-shot segmentation.

Findings

01. MMC significantly reduces the overlap between intra-object and inter-object similarities.

02. MMC achieves top-tier zero-shot segmentation results across datasets.

03. The approach enhances the discriminative representations of SSP ViTs.

Abstract

Self-supervised pretraining (SSP) has emerged as a popular technique in machine learning, enabling the extraction of meaningful feature representations without labelled data. In the realm of computer vision, pretrained vision transformers (ViTs) have played a pivotal role in advancing transfer learning. Nonetheless, the escalating cost of finetuning these large models has posed a challenge due to the explosion of model size. This study endeavours to evaluate the effectiveness of pure self-supervised learning (SSL) techniques in computer vision tasks, obviating the need for finetuning, with the intention of emulating human-like capabilities in generalisation and recognition of unseen objects. To this end, we propose an evaluation protocol for zero-shot segmentation based on a prompting patch. Given a point on the target object as a prompt, the algorithm calculates the similarity map…
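The prompting-patch protocol mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of cosine similarity, the toy feature vectors, and the thresholding step with its value of 0.5 are all assumptions made for illustration. In practice the patch features would come from a frozen pretrained ViT rather than being written by hand.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # define similarity with a zero vector as 0
    return dot / (norm_u * norm_v)

def similarity_map(patch_features, prompt_index):
    """Similarity of every patch to the prompted patch (hypothetical helper)."""
    prompt = patch_features[prompt_index]
    return [cosine(prompt, feature) for feature in patch_features]

def threshold_mask(sim_map, threshold=0.5):
    """Assumed post-processing: keep patches whose similarity meets the threshold."""
    return [s >= threshold for s in sim_map]

# Toy stand-in for ViT patch embeddings: four patches, two dimensions each.
features = [
    [1.0, 0.0],   # patch 0: the prompted patch on the target object
    [0.9, 0.1],   # patch 1: similar feature, likely the same object
    [0.0, 1.0],   # patch 2: orthogonal feature, likely background
    [-1.0, 0.0],  # patch 3: opposite feature
]

sims = similarity_map(features, prompt_index=0)
mask = threshold_mask(sims, threshold=0.5)
```

Here `mask` marks patches 0 and 1 as belonging to the prompted object. The less the intra-object and inter-object similarities overlap, the less sensitive such a mask is to the exact threshold chosen.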

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Machine Learning and ELM