MC-SSL0.0: Towards Multi-Concept Self-Supervised Learning

Sara Atito; Muhammad Awais; Ammarah Farooq; Zhenhua Feng; Josef Kittler

arXiv:2111.15340 · cs.CV · December 1, 2021

Open Access

TL;DR

This paper introduces MC-SSL0.0, a novel self-supervised learning framework that models multiple concepts within images, surpassing existing SSL methods and even supervised transfer learning in multi-label and multi-class vision tasks.

Contribution

The paper proposes MC-SSL0.0, a new SSL framework that effectively captures all concepts in an image using group masked model learning and pseudo-concept learning with a momentum encoder.
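The two ingredients named above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `ema_update` and `group_mask`, the momentum value, the block size, and the masking ratio are all assumptions chosen for illustration. It shows the general idea of a momentum (exponential moving average) encoder update and of masking contiguous groups of image patches rather than isolated ones.

```python
import random


def ema_update(teacher, student, momentum=0.996):
    """Move each teacher weight a small step toward the matching student weight.

    The default momentum is an assumed, illustrative value.
    """
    return [momentum * t + (1.0 - momentum) * s for t, s in zip(teacher, student)]


def group_mask(grid_h, grid_w, block, ratio, rng):
    """Mask contiguous block x block groups of patches.

    Blocks are placed at random until at least `ratio` of the patch grid
    is masked. Block size and ratio are hypothetical parameters.
    """
    mask = [[False] * grid_w for _ in range(grid_h)]
    target = int(ratio * grid_h * grid_w)
    masked = 0
    while masked < target:
        top = rng.randrange(0, grid_h - block + 1)
        left = rng.randrange(0, grid_w - block + 1)
        for r in range(top, top + block):
            for c in range(left, left + block):
                if not mask[r][c]:
                    mask[r][c] = True
                    masked += 1
    return mask


# Example: a 14 x 14 patch grid with roughly half of the patches masked in 2 x 2 groups.
example_mask = group_mask(14, 14, block=2, ratio=0.5, rng=random.Random(0))
example_teacher = ema_update([0.0, 1.0], [1.0, 1.0], momentum=0.9)
```

In a setup like this, a student network would see the masked input and the slowly updated teacher would supply targets; how MC-SSL0.0 forms its pseudo-concept targets is described in the paper itself and is not reproduced here.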

Findings

01. MC-SSL0.0 outperforms existing SSL methods on multi-label classification.

02. MC-SSL0.0 surpasses supervised transfer learning on multi-label and multi-class vision tasks.

03. The approach models multiple concepts in images without using labels.

Abstract

Self-supervised pretraining is the method of choice for natural language processing models and is rapidly gaining popularity in many vision tasks. Recently, self-supervised pretraining has been shown to outperform supervised pretraining for many downstream vision applications, marking a milestone in the area. This superiority is attributed to the negative impact of incomplete labelling of the training images, which convey multiple concepts but are annotated using a single dominant class label. Although Self-Supervised Learning (SSL) is, in principle, free of this limitation, the choice of pretext task facilitating SSL perpetuates this shortcoming by driving the learning process towards a single-concept output. This study aims to investigate the possibility of modelling all the concepts present in an image without using labels. In this respect, the proposed SSL framework MC-SSL0.0 is a…

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques