Generic 3D Representation via Pose Estimation and Matching

Amir R. Zamir; Tilman Wekel; Pulkit Agrawal; Colin Wei; Jitendra Malik; Silvio Savarese

arXiv:1710.08247·cs.CV·October 24, 2017

TL;DR

This paper learns a generic 3D representation by training a ConvNet on two foundational proxy tasks, object-centric camera pose estimation and wide baseline feature matching. The learned representation generalizes to novel 3D tasks and achieves state-of-the-art wide baseline feature matching.

Contribution

It presents a novel multi-task learning approach for 3D representation that generalizes to multiple tasks without fine-tuning and provides a large-scale dataset for further research.

Findings

01. Representation generalizes to novel 3D tasks without fine-tuning

02. Achieves state-of-the-art wide baseline feature matching

03. Performs camera pose estimation comparable to humans

Abstract

Though a large body of computer vision research has investigated developing generic semantic representations, efforts towards developing a similar representation for 3D have been limited. In this paper, we learn a generic 3D representation through solving a set of foundational proxy 3D tasks: object-centric camera pose estimation and wide baseline feature matching. Our method is based upon the premise that by providing supervision over a set of carefully selected foundational tasks, generalization to novel tasks and abstraction capabilities can be achieved. We empirically show that the internal representation of a multi-task ConvNet trained to solve the above core problems generalizes to novel 3D tasks (e.g., scene layout estimation, object pose estimation, surface normal estimation) without the need for fine-tuning and shows traits of abstraction abilities (e.g., cross-modality pose…
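The multi-task supervision described in the abstract can be illustrated with a minimal sketch of a joint objective over the two proxy tasks. This is not the authors' implementation: the function names, the squared-error pose term, the binary cross-entropy matching term, the `match_weight` parameter, and the choice to apply the pose term only to matching pairs are all assumptions made for illustration.

```python
import math


def pose_loss(pred_pose, true_pose):
    """Mean squared error between predicted and ground-truth pose vectors (assumed form)."""
    return sum((p - t) ** 2 for p, t in zip(pred_pose, true_pose)) / len(true_pose)


def match_loss(match_logit, is_match):
    """Binary cross-entropy on a single match / no-match logit (assumed form)."""
    prob = 1.0 / (1.0 + math.exp(-match_logit))
    eps = 1e-12  # guards against log(0)
    return -math.log(prob + eps) if is_match else -math.log(1.0 - prob + eps)


def joint_loss(pred_pose, true_pose, match_logit, is_match, match_weight=1.0):
    """Weighted matching loss plus, for matching pairs only, the pose loss.

    Assumption: relative pose is only defined when the two patches show the
    same physical point, so non-matching pairs contribute no pose term.
    """
    loss = match_weight * match_loss(match_logit, is_match)
    if is_match:
        loss += pose_loss(pred_pose, true_pose)
    return loss


# Hypothetical example: one matching pair with a small pose error,
# and one non-matching pair that the model correctly scores low.
matching = joint_loss([0.1, 0.0, 0.2], [0.0, 0.0, 0.0], match_logit=2.0, is_match=True)
non_matching = joint_loss([0.5, 0.5, 0.5], [0.0, 0.0, 0.0], match_logit=-2.0, is_match=False)
print(matching, non_matching)
```

In a sketch like this, a shared feature extractor would receive gradients from both terms, which is the sense in which a single internal representation is supervised by both tasks.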

Code & Models

Repositories

amir32002/3D_Street_View
Official
