Refine and Represent: Region-to-Object Representation Learning

Akash Gokul, Konstantinos Kallidromitis, Shufan Li, Yusuke Kato, Kazuki Kozuka, Trevor Darrell, and Colorado J Reed

arXiv:2208.11821 · cs.CV · December 22, 2022 · 1 citation

TL;DR

This paper introduces R2O, a unified self-supervised pretraining method that refines regions into object-centric masks, leading to state-of-the-art results in various dense prediction tasks and unsupervised object segmentation.

Contribution

R2O unifies region-based and object-centric pretraining through a dynamic refinement process and a curriculum, improving dense prediction and segmentation performance.

Findings

1. State-of-the-art semantic segmentation on PASCAL VOC and Cityscapes.
2. Improved instance segmentation on MS COCO.
3. Superior unsupervised object segmentation on Caltech-UCSD Birds dataset.

Abstract

Recent works in self-supervised learning have demonstrated strong performance on scene-level dense prediction tasks by pretraining with object-centric or region-based correspondence objectives. In this paper, we present Region-to-Object Representation Learning (R2O) which unifies region-based and object-centric pretraining. R2O operates by training an encoder to dynamically refine region-based segments into object-centric masks and then jointly learns representations of the contents within the mask. R2O uses a "region refinement module" to group small image regions, generated using a region-level prior, into larger regions which tend to correspond to objects by clustering region-level features. As pretraining progresses, R2O follows a region-to-object curriculum which encourages learning region-level features early on and gradually progresses to train object-centric representations.…
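The refinement step and curriculum described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names (`region_features`, `kmeans`, `refine_regions`, `clusters_at`), the mean pooling, the plain k-means with farthest-point initialisation, and the linear schedule with `k_start` and `k_end` are all assumptions chosen for illustration. The idea shown is only that small prior regions are pooled into region-level features, clustered into fewer and larger groups, and that the number of groups shrinks as pretraining progresses.

```python
import numpy as np

def region_features(pixel_feats, region_ids):
    """Mean-pool pixel features of shape (H, W, D) inside each prior region."""
    flat_feats = pixel_feats.reshape(-1, pixel_feats.shape[-1])
    labels, inverse = np.unique(region_ids.ravel(), return_inverse=True)
    sums = np.zeros((labels.size, flat_feats.shape[1]))
    np.add.at(sums, inverse, flat_feats)  # accumulate features per region
    counts = np.bincount(inverse, minlength=labels.size)[:, None]
    # Also return a dense (H, W) map of region indices 0..R-1.
    return sums / counts, inverse.reshape(region_ids.shape)

def kmeans(x, k, iters=10):
    """Plain k-means (assumed clustering choice), deterministic initialisation."""
    centers = [x[0]]
    for _ in range(1, k):
        # Farthest-point init: pick the point farthest from the chosen centers.
        d = np.min([((x - c) ** 2).sum(-1) for c in centers], axis=0)
        centers.append(x[d.argmax()])
    centers = np.stack(centers).astype(float)
    for _ in range(iters):
        dist = ((x[:, None, :] - centers[None]) ** 2).sum(-1)
        assign = dist.argmin(1)
        for j in range(k):
            members = x[assign == j]
            if len(members):  # leave empty clusters where they are
                centers[j] = members.mean(0)
    return assign

def refine_regions(pixel_feats, region_ids, k):
    """Group small prior regions into k larger segments by feature clustering."""
    feats, dense_ids = region_features(pixel_feats, region_ids)
    cluster_of_region = kmeans(feats, min(k, len(feats)))
    return cluster_of_region[dense_ids]  # (H, W) refined mask

def clusters_at(step, total_steps, k_start=16, k_end=2):
    """Hypothetical linear region-to-object schedule: many segments early, few late."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    return int(round(k_start + frac * (k_end - k_start)))

# Toy usage: an 8x8 feature map whose left and right halves differ,
# with a prior of sixteen 2x2 regions.
feats = np.zeros((8, 8, 2))
feats[:, 4:, :] = 10.0
prior = (np.arange(8)[:, None] // 2) * 4 + (np.arange(8)[None, :] // 2)
mask = refine_regions(feats, prior, k=clusters_at(100, 100))
```

In this toy case the sixteen prior regions collapse into two segments, one per half of the image. In an actual pretraining loop the features would come from the encoder being trained, so the refined masks would change as the representation improves.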

Code & Models

Repositories

kkallidromitis/r2o (PyTorch, official)

Models

KonstantinosKK/r2o (Hugging Face model)

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques