CYBORGS: Contrastively Bootstrapping Object Representations by Grounding in Segmentation

Renhao Wang; Hang Zhao; Yang Gao

arXiv:2203.09343 · cs.CV · August 17, 2022

TL;DR

This paper introduces CYBORGS, an end-to-end framework that jointly learns object representations and segmentation masks through contrastive learning, improving pretraining on complex scenes for better transfer to downstream tasks.

Contribution

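The paper presents a joint learning approach that alternates between two steps: training the model with a mask-dependent contrastive loss, and using the partially trained model to bootstrap better segmentation masks. Iterating the two grounds the contrastive updates in segmentation information while the masks improve alongside the representations.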

Findings

1. Robust transfer of learned representations to downstream tasks.
2. Improved segmentation quality during pretraining.
3. Enhanced performance in classification, detection, and segmentation.

Abstract

Many recent approaches in contrastive learning have worked to close the gap between pretraining on iconic images like ImageNet and pretraining on complex scenes like COCO. This gap exists largely because commonly used random crop augmentations obtain semantically inconsistent content in crowded scene images of diverse objects. Previous works use preprocessing pipelines to localize salient objects for improved cropping, but an end-to-end solution is still elusive. In this work, we propose a framework which accomplishes this goal via joint learning of representations and segmentation. We leverage segmentation masks to train a model with a mask-dependent contrastive loss, and use the partially trained model to bootstrap better masks. By iterating between these two components, we ground the contrastive updates in segmentation information, and simultaneously improve segmentation throughout…
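The alternation described in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the function names (`masked_pool`, `mask_contrastive_loss`, `bootstrap_masks`), the InfoNCE form of the loss, the temperature value, and the nearest-prototype rule for refreshing masks are all assumptions chosen for illustration, and the two views are assumed to share the same geometry so that one set of masks applies to both. No encoder is trained here; the loop only shows where the loss and the mask update would sit.

```python
import numpy as np

def masked_pool(features, masks):
    """Average the features inside each mask.
    features: (H, W, C) array; masks: (K, H, W) binary array. Returns (K, C)."""
    masks = masks.astype(features.dtype)
    area = masks.sum(axis=(1, 2))[:, None]               # pixels per mask
    pooled = np.einsum("khw,hwc->kc", masks, features)   # per-mask feature sums
    return pooled / np.maximum(area, 1.0)                # empty masks pool to zero

def l2_normalize(x, eps=1e-8):
    """Scale vectors along the last axis to unit length."""
    return x / np.maximum(np.linalg.norm(x, axis=-1, keepdims=True), eps)

def mask_contrastive_loss(q, k, temperature=0.2):
    """InfoNCE over mask-pooled embeddings (an assumed form of the loss).
    q[i] and k[i] come from the same mask in two views; other rows are negatives."""
    q, k = l2_normalize(q), l2_normalize(k)
    logits = q @ k.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))

def bootstrap_masks(features, prototypes):
    """Refresh masks by assigning each pixel to its most similar prototype
    (cosine similarity). This rule is a stand-in assumption, not the paper's."""
    h, w, c = features.shape
    sim = l2_normalize(features.reshape(-1, c)) @ l2_normalize(prototypes).T
    labels = sim.argmax(axis=1).reshape(h, w)
    ids = np.arange(prototypes.shape[0])[:, None, None]
    return labels[None] == ids                           # (K, H, W) boolean masks

# Toy alternation on random "feature maps" standing in for encoder outputs.
rng = np.random.default_rng(0)
feats_a = rng.normal(size=(8, 8, 16))                    # view 1
feats_b = feats_a + 0.1 * rng.normal(size=(8, 8, 16))    # view 2 (perturbed copy)
masks = np.zeros((2, 8, 8), dtype=bool)
masks[0, :, :4] = True                                   # crude initial masks:
masks[1, :, 4:] = True                                   # left half / right half

for step in range(3):
    q = masked_pool(feats_a, masks)
    k = masked_pool(feats_b, masks)
    loss = mask_contrastive_loss(q, k)
    # A real pipeline would take a gradient step on the encoder here.
    masks = bootstrap_masks(feats_a, q)                  # bootstrap better masks
```

In this toy version the masks are the only thing that changes between iterations. In an actual training run the encoder would also be updated from the loss, so each refreshed set of masks would reflect the improved features.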

Code & Models

Repositories

renwang435/cyborgs (PyTorch, official)

Taxonomy

Topics: Visual Attention and Saliency Detection · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques

Methods: Contrastive Learning