Discovery of Latent 3D Keypoints via End-to-end Geometric Reasoning

Supasorn Suwajanakorn; Noah Snavely; Jonathan Tompson; Mohammad Norouzi

arXiv:1807.03146·cs.CV·November 26, 2018·132 cites

TL;DR

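KeypointNet is an end-to-end framework that learns category-specific 3D keypoints and their detectors from single images, without ground-truth keypoint annotations. The keypoints are optimized for a downstream task, demonstrated here on relative pose estimation, where the model outperforms a fully supervised baseline that uses the same network architecture.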

Contribution

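The paper introduces KeypointNet, an end-to-end geometric reasoning framework that discovers 3D keypoints without keypoint supervision by optimizing them for a downstream task, together with a differentiable objective for recovering the relative pose between two views of an object.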

Findings

01

Discovered keypoints that are geometrically and semantically consistent across viewing angles and across instances of an object category.

02

Outperformed a fully supervised baseline with the same neural network architecture on 3D pose estimation, using no ground-truth keypoint annotations.

03

Demonstrated on the car, chair, and plane categories of ShapeNet.

Abstract

This paper presents KeypointNet, an end-to-end geometric reasoning framework to learn an optimal set of category-specific 3D keypoints, along with their detectors. Given a single image, KeypointNet extracts 3D keypoints that are optimized for a downstream task. We demonstrate this framework on 3D pose estimation by proposing a differentiable objective that seeks the optimal set of keypoints for recovering the relative pose between two views of an object. Our model discovers geometrically and semantically consistent keypoints across viewing angles and instances of an object category. Importantly, we find that our end-to-end framework using no ground-truth keypoint annotations outperforms a fully supervised baseline using the same neural network architecture on the task of pose estimation. The discovered 3D keypoints on the car, chair, and plane categories of ShapeNet are visualized at…
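
The abstract describes recovering the relative pose between two views from predicted 3D keypoints but does not give the formula here. As a rough illustration only, the sketch below shows one standard way to recover a rotation from two sets of corresponding 3D points: the orthogonal Procrustes (Kabsch) solution via SVD, scored by the angle between the recovered and true rotations. The function names `relative_rotation` and `angular_error`, the synthetic keypoints, and the 30-degree test rotation are all assumptions made for this example; none of it is taken from the KeypointNet code.

```python
import numpy as np


def relative_rotation(p1, p2):
    """Least-squares rotation R with R @ p1[i] ~= p2[i] (Kabsch / orthogonal Procrustes).

    p1, p2: (N, 3) arrays of corresponding 3D keypoints (hypothetical inputs).
    """
    a = p1 - p1.mean(axis=0)          # center both point sets
    b = p2 - p2.mean(axis=0)
    h = a.T @ b                       # 3x3 cross-covariance matrix
    u, _, vt = np.linalg.svd(h)
    # Flip the last axis if needed so the result is a proper rotation (det = +1).
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    return vt.T @ np.diag([1.0, 1.0, d]) @ u.T


def angular_error(r_est, r_true):
    """Angle in radians between two rotation matrices."""
    cos_angle = (np.trace(r_est.T @ r_true) - 1.0) / 2.0
    return float(np.arccos(np.clip(cos_angle, -1.0, 1.0)))


# Synthetic check: rotate random "keypoints" by 30 degrees about the z axis.
rng = np.random.default_rng(0)
theta = np.deg2rad(30.0)
r_true = np.array([
    [np.cos(theta), -np.sin(theta), 0.0],
    [np.sin(theta),  np.cos(theta), 0.0],
    [0.0,            0.0,           1.0],
])
keypoints_view1 = rng.normal(size=(10, 3))
keypoints_view2 = keypoints_view1 @ r_true.T   # row i becomes r_true @ row i

r_est = relative_rotation(keypoints_view1, keypoints_view2)
# The error should be close to 0 for these noise-free keypoints.
print("angular error (rad):", angular_error(r_est, r_true))
```

Every step of this procedure (centering, SVD, matrix products) is differentiable almost everywhere, which is why an angular error of this kind can in principle serve as a training signal for the keypoint detector in an end-to-end setup.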

Code & Models

Repositories

tensorflow/models/tree/master/research/keypointnet (TensorFlow)

Taxonomy

Topics: Human Pose and Action Recognition · Advanced Image and Video Retrieval Techniques · Advanced Vision and Imaging