Learning a Multi-View Stereo Machine
Abhishek Kar, Christian Häne, Jitendra Malik

TL;DR
This paper introduces a differentiable, end-to-end learnable multi-view stereo system that leverages 3D geometry for improved 3D reconstruction from fewer images, including single-image scenarios.
Contribution
The paper presents a learning framework that builds geometric operations (feature projection and unprojection along viewing rays) into the network in differentiable form, enabling 3D reconstruction from fewer images and completion of unseen surfaces.
Findings
Outperforms classical methods on the ShapeNet dataset
Achieves accurate 3D reconstructions from a single image
Demonstrates benefits over recent learning-based approaches
Abstract
We present a learnt system for multi-view stereopsis. In contrast to recent learning-based methods for 3D reconstruction, we leverage the underlying 3D geometry of the problem through feature projection and unprojection along viewing rays. By formulating these operations in a differentiable manner, we are able to learn the system end-to-end for the task of metric 3D reconstruction. End-to-end learning allows us to jointly reason about shape priors while conforming to geometric constraints, enabling reconstruction from far fewer images (even a single image) than required by classical approaches as well as completion of unseen surfaces. We thoroughly evaluate our approach on the ShapeNet dataset and demonstrate the benefits over classical approaches as well as recent learning-based methods.
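The unprojection step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a pinhole camera with intrinsics `K` and pose `(R, t)`, a cubic voxel grid, and bilinear sampling, and the function name `unproject_features` and all parameter names are hypothetical. It shows only the forward computation in NumPy; a trainable version would need the same sampling written in an autodiff framework.

```python
import numpy as np

def unproject_features(feat, K, R, t, grid_min, grid_max, res):
    """Lift a 2D feature map (H, W, C) into a (res, res, res, C) voxel grid.

    Assumed model (illustrative only): each voxel centre X is projected with
    x ~ K (R X + t) and takes the bilinearly interpolated feature at that
    pixel. Voxels behind the camera or outside the image get zeros.
    """
    H, W, C = feat.shape

    # Voxel centres in world coordinates, flattened to (res**3, 3).
    axis = np.linspace(grid_min, grid_max, res)
    X = np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1).reshape(-1, 3)

    # World -> camera -> pixel coordinates.
    cam = X @ R.T + t
    z = cam[:, 2]
    in_front = z > 1e-6
    safe_z = np.where(in_front, z, 1.0)  # avoid dividing by zero or negative depth
    pix = cam @ K.T
    u = pix[:, 0] / safe_z
    v = pix[:, 1] / safe_z
    valid = in_front & (u >= 0) & (u <= W - 1) & (v >= 0) & (v <= H - 1)

    # Bilinear interpolation; indices are clipped so lookups stay in bounds,
    # and invalid voxels are zeroed afterwards.
    u0 = np.clip(np.floor(u).astype(int), 0, W - 2)
    v0 = np.clip(np.floor(v).astype(int), 0, H - 2)
    du = (u - u0)[:, None]
    dv = (v - v0)[:, None]
    out = (feat[v0, u0] * (1 - du) * (1 - dv)
           + feat[v0, u0 + 1] * du * (1 - dv)
           + feat[v0 + 1, u0] * (1 - du) * dv
           + feat[v0 + 1, u0 + 1] * du * dv)
    out[~valid] = 0.0
    return out.reshape(res, res, res, C)
```

Because every voxel along a viewing ray projects to the same pixel, the sketch copies one image feature along the whole ray. With several views, each view yields its own grid, and matching features from different views land in the same voxel where the rays intersect.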
Taxonomy
Topics: Advanced Vision and Imaging · Computer Graphics and Visualization Techniques · Advanced Numerical Analysis Techniques
