TL;DR
SurfaceNet is an end-to-end 3D neural network that directly infers 3D models from multiview images, learning photo-consistency and geometric relations simultaneously for improved stereopsis.
Contribution
It introduces a fully 3D convolutional network that encodes camera parameters with images in a voxel space for multiview 3D reconstruction.
Findings
Effective on large-scale DTU benchmark
Outperforms traditional multiview stereopsis methods
Learns both photo-consistency and geometry end-to-end
Abstract
This paper proposes an end-to-end learning framework for multiview stereopsis. We term the network SurfaceNet. It takes a set of images and their corresponding camera parameters as input and directly infers the 3D model. The key advantage of the framework is that both photo-consistency as well as geometric relations of the surface structure can be directly learned for the purpose of multiview stereopsis in an end-to-end fashion. SurfaceNet is a fully 3D convolutional network, which is achieved by encoding the camera parameters together with the images in a 3D voxel representation. We evaluate SurfaceNet on the large-scale DTU benchmark.
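The abstract says the camera parameters are encoded together with the images in a 3D voxel representation, but gives no further detail. The following is a minimal sketch of one plausible way to do that, not necessarily the paper's exact scheme: each voxel center is projected into a view with a 3x4 projection matrix and takes the color of the pixel it lands on. The function name `colored_voxel_cube`, its parameters, and the nearest-pixel sampling are all assumptions made for illustration.

```python
import numpy as np

def colored_voxel_cube(image, P, origin, voxel_size, s):
    """Back-project one view into an s x s x s voxel grid (illustrative sketch).

    image:      (H, W, 3) array of pixel colors.
    P:          (3, 4) projection matrix mapping homogeneous world points
                to homogeneous pixel coordinates (u*w, v*w, w).
    origin:     (3,) world position of the cube's minimum corner.
    voxel_size: edge length of one voxel in world units.
    Returns a (3, s, s, s) array; voxels that project outside the image
    or lie behind the camera are left at zero.
    """
    H, W = image.shape[:2]
    idx = np.arange(s)
    gx, gy, gz = np.meshgrid(idx, idx, idx, indexing="ij")
    # World coordinates of every voxel center, shape (s, s, s, 3).
    centers = np.asarray(origin, dtype=float) + (
        np.stack([gx, gy, gz], axis=-1) + 0.5
    ) * voxel_size
    homog = np.concatenate([centers, np.ones((s, s, s, 1))], axis=-1)
    uvw = homog @ P.T                      # (s, s, s, 3)
    w = uvw[..., 2]
    valid = w > 1e-9                       # keep points in front of the camera
    safe_w = np.where(valid, w, 1.0)       # avoid dividing by zero or negatives
    u = np.rint(uvw[..., 0] / safe_w).astype(int)
    v = np.rint(uvw[..., 1] / safe_w).astype(int)
    valid &= (u >= 0) & (u < W) & (v >= 0) & (v < H)
    cube = np.zeros((s, s, s, 3), dtype=image.dtype)
    cube[valid] = image[v[valid], u[valid]]  # nearest-pixel color lookup
    return np.moveaxis(cube, -1, 0)          # channels first for a 3D CNN
```

Under this assumption, cubes from several views of the same region could be concatenated along the channel axis, for example `np.concatenate([cube_a, cube_b], axis=0)`, so that a 3D convolutional network sees the colors from each view aligned voxel by voxel. The camera geometry then enters the network only through where each color lands in the grid.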
