Learning Dense Object Descriptors from Multiple Views for Low-shot Category Generalization

Stefan Stojanov; Anh Thai; Zixuan Huang; James M. Rehg

arXiv:2211.15059 · cs.CV · November 29, 2022

TL;DR

This paper introduces DOPE, a self-supervised method for learning dense object representations from multiple views, enabling low-shot category recognition without category labels, and outperforming existing baselines.

Contribution

The work presents a novel self-supervised approach to learn dense discriminative object features from multiple views without category labels, facilitating low-shot recognition.

Findings

1. DOPE achieves competitive low-shot classification performance.

2. It outperforms supervised and self-supervised baselines.

3. Training requires sparse depths, foreground masks, and known camera parameters, which are used to obtain pixel-level correspondences between views of an object.

Abstract

A hallmark of the deep learning era for computer vision is the successful use of large-scale labeled datasets to train feature representations for tasks ranging from object recognition and semantic segmentation to optical flow estimation and novel view synthesis of 3D scenes. In this work, we aim to learn dense discriminative object representations for low-shot category recognition without requiring any category labels. To this end, we propose Deep Object Patch Encodings (DOPE), which can be trained from multiple views of object instances without any category or semantic object part labels. To train DOPE, we assume access to sparse depths, foreground masks and known cameras, to obtain pixel-level correspondences between views of an object, and use this to formulate a self-supervised learning task to learn discriminative object patches. We find that DOPE can directly be used for low-shot…
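
The correspondence step described in the abstract (depth, foreground masks, and known cameras giving pixel-level matches between two views) can be illustrated with standard pinhole reprojection. The following is a minimal sketch, not the authors' implementation: the function name `pixel_correspondences`, the shared intrinsics matrix `K`, the 4x4 camera-to-camera transform `T_a_to_b`, and the convention that missing depth is stored as 0 are all assumptions made for illustration. It also skips any occlusion check, which a real pipeline would likely need.

```python
import numpy as np

def pixel_correspondences(depth_a, mask_a, mask_b, K, T_a_to_b):
    """Hypothetical helper: map foreground pixels of view A into view B.

    depth_a  : (H, W) z-depth for view A, 0 where depth is missing (assumed convention)
    mask_a/b : (H, W) boolean foreground masks for views A and B
    K        : (3, 3) pinhole intrinsics, assumed shared by both views
    T_a_to_b : (4, 4) rigid transform from camera-A to camera-B coordinates
    Returns two (N, 2) integer arrays of (u, v) pixels: one in A, one in B.
    """
    H, W = depth_a.shape

    # Keep only foreground pixels of A that actually have a depth value.
    v, u = np.nonzero(mask_a & (depth_a > 0))
    z = depth_a[v, u]

    # Back-project pixels to 3D points in camera-A coordinates.
    pix = np.stack([u, v, np.ones_like(u)], axis=0).astype(np.float64)  # (3, N)
    pts_a = (np.linalg.inv(K) @ pix) * z                                # (3, N)

    # Move the points into camera-B coordinates.
    pts_a_h = np.vstack([pts_a, np.ones((1, pts_a.shape[1]))])          # (4, N)
    pts_b = (T_a_to_b @ pts_a_h)[:3]                                    # (3, N)

    # Project into view B, guarding against points behind the camera.
    in_front = pts_b[2] > 1e-8
    proj = K @ pts_b
    denom = np.where(in_front, proj[2], 1.0)
    ub = np.round(proj[0] / denom).astype(int)
    vb = np.round(proj[1] / denom).astype(int)

    # Keep matches that land inside the image and on B's foreground mask.
    inside = in_front & (ub >= 0) & (ub < W) & (vb >= 0) & (vb < H)
    valid = inside.copy()
    valid[inside] = mask_b[vb[inside], ub[inside]]

    pix_a = np.stack([u[valid], v[valid]], axis=1)
    pix_b = np.stack([ub[valid], vb[valid]], axis=1)
    return pix_a, pix_b
```

Each returned pair indexes the same surface point in both images, so features sampled at those locations could serve as positives in a self-supervised objective. Because depth is only needed at the pixels being matched, the same routine applies unchanged when `depth_a` is sparse.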

Code & Models

Repositories

rehg-lab/dope_selfsup (PyTorch, official)

Videos

Learning Dense Object Descriptors from Multiple Views for Low-shot Category Generalization · SlidesLive

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Multimodal Machine Learning Applications