ShapeCodes: Self-Supervised Feature Learning by Lifting Views to Viewgrids
Dinesh Jayaraman, Ruohan Gao, and Kristen Grauman

TL;DR
ShapeCodes introduces a self-supervised method that learns 3D shape representations from single images by predicting all views of an object, enabling effective shape understanding and recognition without manual labels.
Contribution
The paper presents a novel self-supervised approach that embeds 3D shape information into image features by lifting views to viewgrids, outperforming existing unsupervised methods.
Findings
Successfully predicts unseen views of objects.
Learns shape primitives and semantic regularities.
Outperforms existing unsupervised feature learning methods.
Abstract
We introduce an unsupervised feature learning approach that embeds 3D shape information into a single-view image representation. The main idea is a self-supervised training objective that, given only a single 2D image, requires all unseen views of the object to be predictable from learned features. We implement this idea as an encoder-decoder convolutional neural network. The network maps an input image of an unknown category and unknown viewpoint to a latent space, from which a deconvolutional decoder can best "lift" the image to its complete viewgrid showing the object from all viewing angles. Our class-agnostic training procedure encourages the representation to capture fundamental shape primitives and semantic regularities in a data-driven manner, without manual semantic labels. Our results on two widely-used shape datasets show 1) our approach successfully learns to perform…
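To make the "lift a view to its viewgrid" objective concrete, here is a minimal sketch of how a reconstruction target and loss could be set up. It is not the authors' implementation: the names `shift_viewgrid` and `viewgrid_mse`, the grid layout `[elevation][azimuth]`, the choice of mean squared pixel error, and the idea of circularly shifting azimuths so the observed view sits in a canonical column (one plausible way to cope with an unknown input viewpoint) are all assumptions made for illustration.

```python
from typing import List

# Hypothetical data layout, assumed for this sketch only.
Image = List[List[float]]      # one grayscale view: rows of pixel intensities
Viewgrid = List[List[Image]]   # views indexed as grid[elevation][azimuth]


def shift_viewgrid(grid: Viewgrid, azimuth_index: int) -> Viewgrid:
    """Circularly shift azimuths so column `azimuth_index` becomes column 0.

    Assumption: the target grid is expressed relative to the observed view,
    so the decoder never needs the absolute viewpoint.
    """
    return [row[azimuth_index:] + row[:azimuth_index] for row in grid]


def viewgrid_mse(predicted: Viewgrid, target: Viewgrid) -> float:
    """Mean squared pixel error averaged over every view in the grid."""
    total, count = 0.0, 0
    for pred_row, tgt_row in zip(predicted, target):
        for pred_view, tgt_view in zip(pred_row, tgt_row):
            for pred_px_row, tgt_px_row in zip(pred_view, tgt_view):
                for p, t in zip(pred_px_row, tgt_px_row):
                    total += (p - t) ** 2
                    count += 1
    return total / count
```

In a training loop, an encoder-decoder network would produce `predicted` from a single input view, and `viewgrid_mse(predicted, shift_viewgrid(full_grid, observed_azimuth))` would be the quantity to minimize; the features of interest are the encoder's latent outputs.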
Taxonomy
Topics: Image Processing and 3D Reconstruction · Human Pose and Action Recognition · 3D Surveying and Cultural Heritage
