Deep Interpretable Non-Rigid Structure from Motion
Chen Kong, Simon Lucey

TL;DR
This paper introduces an interpretable deep neural network for non-rigid structure from motion (NRSfM) that recovers camera poses and 3D points from 2D image coordinates alone, and that scales to larger and more complex shape collections than earlier NRSfM methods.
Contribution
The paper presents a deep neural network that is mathematically interpretable as a multi-layer block sparse dictionary learning problem, which lets it handle complex shape variability and large-scale problems.
Findings
Outperforms state-of-the-art methods in accuracy and robustness
Demonstrates strong generalization to unseen data
Can recover 3D shape from a single image without 3D ground-truth
Abstract
All current non-rigid structure from motion (NRSfM) algorithms are limited with respect to: (i) the number of images, and (ii) the type of shape variability they can handle. This has hampered the practical utility of NRSfM for many applications within vision. In this paper we propose a novel deep neural network to recover camera poses and 3D points solely from an ensemble of 2D image coordinates. The proposed neural network is mathematically interpretable as a multi-layer block sparse dictionary learning problem, and can handle problems of unprecedented scale and shape complexity. Extensive experiments demonstrate the impressive performance of our approach where we exhibit superior precision and robustness against all available state-of-the-art works. The considerable model capacity of our approach affords remarkable generalization to unseen data. We propose a quality measure (based on…
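To make the dictionary-learning reading of the abstract more concrete, the following is a minimal sketch of the single-layer idea only: a 3D shape is written as a sparse combination of dictionary atoms and then projected to 2D. Everything specific here is an assumption for illustration, not the authors' implementation: the orthographic camera, the plain L1 (rather than block) sparsity, the ISTA solver, the toy sizes, and the names `soft_threshold`, `ista`, and `project`.

```python
import numpy as np

def soft_threshold(x, lam):
    """Element-wise soft-thresholding, the proximal operator of the L1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def ista(D, y, lam=0.01, n_iter=2000):
    """Approximately solve min_z 0.5*||y - D z||^2 + lam*||z||_1 with ISTA."""
    # Step size is 1 / L, where L is the Lipschitz constant of the gradient.
    step = 1.0 / np.linalg.norm(D, 2) ** 2
    z = np.zeros(D.shape[1])
    for _ in range(n_iter):
        z = soft_threshold(z + step * D.T @ (y - D @ z), step * lam)
    return z

def project(shape_3d, rotation):
    """Orthographic projection (assumed camera model): rotate a 3 x P shape
    and keep the first two rows as 2D image coordinates."""
    return (rotation @ shape_3d)[:2]

rng = np.random.default_rng(0)
P, K = 10, 6                              # toy sizes: points per shape, atoms
D = rng.standard_normal((3 * P, K))       # hypothetical shape dictionary
z_true = np.zeros(K)
z_true[[1, 4]] = [1.5, -2.0]              # only two atoms are active
s = D @ z_true                            # vectorised 3D shape, length 3P

z_hat = ista(D, s)                        # sparse code recovered from the shape
W = project(s.reshape(3, P), np.eye(3))   # 2 x P image coordinates
```

In the actual NRSfM setting only `W` is observed and the camera rotation is unknown, so the code above illustrates the shape model rather than the full recovery problem described in the abstract.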
Taxonomy
Topics: Advanced Vision and Imaging · Image and Object Detection Techniques · 3D Surveying and Cultural Heritage
