Object discovery and representation networks

Olivier J. Hénaff, Skanda Koppula, Evan Shelhamer, Daniel Zoran, Andrew Jaegle, Andrew Zisserman, João Carreira, Relja Arandjelović

arXiv:2203.08777 · cs.CV · July 28, 2022

TL;DR

Odin is a self-supervised learning framework that automatically discovers image structures like objects and segments, leading to improved transfer learning performance across various vision tasks without relying on manual annotations.

Contribution

The paper introduces Odin, a novel self-supervised approach that jointly learns object discovery and representation, eliminating the need for handcrafted image segmentations or specialized augmentations.
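To make the idea of coupling discovery with representation concrete, here is a minimal sketch. This page does not describe the mechanism, so the sketch assumes one plausible approach: cluster per-location feature vectors with k-means to obtain segments, then average the features inside each segment to get one vector per discovered region. The function names `kmeans` and `mask_pool` are hypothetical and are not taken from the paper's code.

```python
import math

def kmeans(features, k, iters=10):
    """Cluster feature vectors; returns one cluster id per location.

    Assumption: segments are found by plain k-means on features.
    """
    if k > len(features):
        raise ValueError("k must not exceed the number of feature vectors")
    # Deterministic init: take k evenly spaced feature vectors as centroids.
    step = len(features) // k
    centroids = [list(features[i * step]) for i in range(k)]
    assign = [0] * len(features)
    for _ in range(iters):
        # Assignment step: each location joins its nearest centroid.
        assign = [
            min(range(k), key=lambda c: math.dist(f, centroids[c]))
            for f in features
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [f for f, a in zip(features, assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign

def mask_pool(features, assign, k):
    """Average the features inside each discovered segment."""
    pooled = {}
    for c in range(k):
        members = [f for f, a in zip(features, assign) if a == c]
        if members:
            pooled[c] = [sum(col) / len(members) for col in zip(*members)]
    return pooled

# Toy "feature map": six locations with 2-D features forming two groups.
features = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
            [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
segments = kmeans(features, k=2)
region_vectors = mask_pool(features, segments, k=2)
print(segments)
print(region_vectors)
```

In a full system the pooled region vectors would feed a self-supervised objective that compares the same region across augmented views; that training loop is omitted here because its details are not given on this page.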

Findings

01 Achieves state-of-the-art transfer learning results on COCO, PASCAL, and Cityscapes.

02 Surpasses supervised pre-training for video segmentation on DAVIS.

03 Is simpler, less brittle, and more general than prior SSL methods that rely on hand-crafted segmentations or specialized augmentations.

Abstract

The promise of self-supervised learning (SSL) is to leverage large amounts of unlabeled data to solve complex tasks. While there has been excellent progress with simple, image-level learning, recent methods have shown the advantage of including knowledge of image structure. However, by introducing hand-crafted image segmentations to define regions of interest, or specialized augmentation strategies, these methods sacrifice the simplicity and generality that makes SSL so powerful. Instead, we propose a self-supervised learning paradigm that discovers this image structure by itself. Our method, Odin, couples object discovery and representation networks to discover meaningful image segmentations without any supervision. The resulting learning paradigm is simpler, less brittle, and more general, and achieves state-of-the-art transfer learning results for object detection and instance…

Code & Models

Repositories

theostoican/Guided-Research/blob/main/finetuning_odin_resnet.ipynb
