Shape and Viewpoint without Keypoints

Shubham Goel; Angjoo Kanazawa; Jitendra Malik

arXiv:2007.10982·cs.CV·July 22, 2020

TL;DR

This paper introduces a learning framework that recovers 3D shape, pose, and texture from a single image without ground-truth 3D shape, multi-view, camera viewpoint, or keypoint supervision. Its key ingredient is a representation of the distribution over cameras called "camera-multiplex."

Contribution

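It proposes U-CMR (Unsupervised Category-Specific Mesh Reconstruction), which maintains a set of camera hypotheses that are optimized during training to best explain each image given the current shape and texture, instead of committing to a single point estimate of the camera.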

Findings

01 Achieves state-of-the-art camera prediction results.

02 Learns diverse shapes and textures without keypoint or 3D ground truth.

03 Demonstrates effectiveness on multiple datasets.

Abstract

We present a learning framework that learns to recover the 3D shape, pose and texture from a single image, trained on an image collection without any ground truth 3D shape, multi-view, camera viewpoints or keypoint supervision. We approach this highly under-constrained problem in an "analysis by synthesis" framework where the goal is to predict the likely shape, texture and camera viewpoint that could produce the image with various learned category-specific priors. Our particular contribution in this paper is a representation of the distribution over cameras, which we call "camera-multiplex". Instead of picking a point estimate, we maintain a set of camera hypotheses that are optimized during training to best explain the image given the current shape and texture. We call our approach Unsupervised Category-Specific Mesh Reconstruction (U-CMR), and present qualitative and quantitative…
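
The idea of keeping several camera hypotheses instead of a single point estimate can be illustrated with a toy sketch. This is not the U-CMR implementation: everything here is an assumption made for illustration, including the azimuth-only camera, the orthographic projection, the point reprojection loss, the finite-difference gradient, the softmax over negative losses, and all function names and parameter values.

```python
import math

def project(points, theta):
    """Rotate 3D points about the y-axis by theta, then drop z (orthographic)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x + s * z, y) for x, y, z in points]

def reprojection_loss(points, observed, theta):
    """Mean squared 2D distance between projected and observed points."""
    proj = project(points, theta)
    return sum((u - ou) ** 2 + (v - ov) ** 2
               for (u, v), (ou, ov) in zip(proj, observed)) / len(points)

def optimize_multiplex(points, observed, num_cameras=8, steps=200,
                       lr=0.1, eps=1e-4):
    """Refine every camera hypothesis independently by gradient descent."""
    # Hypothetical initialization: azimuths spread evenly over the circle.
    thetas = [2 * math.pi * k / num_cameras for k in range(num_cameras)]
    for _ in range(steps):
        for k, th in enumerate(thetas):
            # Central finite difference stands in for a differentiable renderer.
            grad = (reprojection_loss(points, observed, th + eps)
                    - reprojection_loss(points, observed, th - eps)) / (2 * eps)
            thetas[k] = th - lr * grad
    losses = [reprojection_loss(points, observed, th) for th in thetas]
    return thetas, losses

def camera_probabilities(losses, temperature=0.1):
    """Softmax over negative losses (an assumed way to weight hypotheses)."""
    m = min(losses)  # subtract the minimum for numerical stability
    weights = [math.exp(-(l - m) / temperature) for l in losses]
    total = sum(weights)
    return [w / total for w in weights]

# Toy elongated shape, observed from a "true" azimuth of 1.3 radians.
POINTS = [(2.0, 0.0, 0.0), (-2.0, 0.5, 0.0), (0.0, 1.0, 1.0), (0.0, -1.0, 0.5)]
TRUE_THETA = 1.3
OBSERVED = project(POINTS, TRUE_THETA)

thetas, losses = optimize_multiplex(POINTS, OBSERVED)
probs = camera_probabilities(losses)
best = min(range(len(losses)), key=losses.__getitem__)
print("best azimuth:", round(thetas[best], 4))
print("probabilities:", [round(p, 4) for p in probs])
```

In this toy setup the elongated shape creates a mirrored local minimum: hypotheses initialized on the far side of the object settle there with a nonzero loss, while the others recover the true azimuth. A single point estimate initialized on the wrong side would stay stuck, which is the kind of failure a set of hypotheses is meant to guard against.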
