MVLayoutNet: 3D layout reconstruction with multi-view panoramas
Zhihua Hu, Bo Duan, Yanfeng Zhang, Mingwei Sun, Jingwei Huang

TL;DR
MVLayoutNet is an end-to-end neural network that combines learned monocular layout estimation with multi-view stereo for accurate 3D scene reconstruction from panoramas, outperforming existing methods.
Contribution
The paper introduces a novel MVS module with a layout cost volume and attention mechanism, enhancing 3D layout accuracy in panoramic scene reconstruction.
Findings
Outperforms state-of-the-art methods in depth RMSE by 21.7%.
Produces coherent scene layouts enabling full scene reconstruction.
Effectively integrates layout estimation with multi-view stereo.
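The depth RMSE finding is reported as a percentage improvement over prior methods. As a minimal sketch of how such a figure is commonly computed, assuming the standard root-mean-square error over pixels with valid ground truth and a relative error reduction against a baseline (the function names, the validity rule, and the toy values are hypothetical and not taken from the paper):

```python
import numpy as np

def depth_rmse(pred, gt):
    """Root-mean-square depth error over pixels with valid ground truth."""
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    valid = gt > 0  # assumption: non-positive depth marks missing ground truth
    return float(np.sqrt(np.mean((pred[valid] - gt[valid]) ** 2)))

def relative_improvement(baseline_rmse, new_rmse):
    """Fractional error reduction relative to the baseline."""
    return (baseline_rmse - new_rmse) / baseline_rmse

# Hypothetical toy depths in metres, for illustration only.
gt = np.array([2.0, 3.0, 0.0, 4.0])
pred = np.array([2.5, 3.0, 1.0, 3.5])
rmse = depth_rmse(pred, gt)
```

Under this reading, a method whose RMSE is 0.783 times the baseline's would be reported as a 21.7% improvement. Whether the paper uses exactly this definition is not stated in the summary.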
Abstract
We present MVLayoutNet, an end-to-end network for holistic 3D reconstruction from multi-view panoramas. Our core contribution is to seamlessly combine learned monocular layout estimation and multi-view stereo (MVS) for accurate layout reconstruction in both 3D and image space. We jointly train a layout module to produce an initial layout and a novel MVS module to obtain accurate layout geometry. Unlike standard MVSNet [33], our MVS module takes a newly proposed layout cost volume, which aggregates multi-view costs at the same depth layer into corresponding layout elements. We additionally provide an attention-based scheme that guides the MVS module to focus on structural regions. Such a design considers both local pixel-level costs and global holistic information for better reconstruction. Experiments show that our method outperforms state-of-the-art methods in terms of depth RMSE by 21.7% and…
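The layout cost volume described in the abstract pools multi-view matching costs at each depth layer into layout elements. A minimal sketch of that pooling idea follows. It assumes a precomputed per-pixel cost volume of shape (D, H, W), an integer map assigning each pixel to a layout element, and simple mean pooling. These shapes, names, and the choice of mean pooling are illustrative assumptions; the paper's module is learned and may aggregate differently.

```python
import numpy as np

def layout_cost_volume(pixel_cost, element_ids, num_elements):
    """Average per-pixel costs within each layout element, per depth layer.

    pixel_cost:   (D, H, W) matching cost per depth layer and pixel (assumed given)
    element_ids:  (H, W) integer layout-element id per pixel, in [0, num_elements)
    returns:      (D, num_elements) aggregated cost
    """
    num_layers = pixel_cost.shape[0]
    flat_cost = pixel_cost.reshape(num_layers, -1)
    flat_ids = element_ids.reshape(-1)
    counts = np.bincount(flat_ids, minlength=num_elements)
    agg = np.zeros((num_layers, num_elements))
    for d in range(num_layers):
        # Sum the costs of all pixels that belong to each element at depth layer d.
        agg[d] = np.bincount(flat_ids, weights=flat_cost[d], minlength=num_elements)
    # Divide by pixel counts; elements with no pixels keep a cost of zero.
    return agg / np.maximum(counts, 1)

# Toy example: 3 depth layers, a 2x2 image, 2 layout elements (top row, bottom row).
pixel_cost = np.array([
    [[1.0, 3.0], [5.0, 5.0]],
    [[0.0, 0.0], [4.0, 6.0]],
    [[2.0, 2.0], [1.0, 1.0]],
])
element_ids = np.array([[0, 0], [1, 1]])
agg = layout_cost_volume(pixel_cost, element_ids, num_elements=2)
best_layer = agg.argmin(axis=0)  # lowest-cost depth layer per layout element
```

Pooling per element rather than per pixel is what lets a single depth hypothesis be scored for a whole wall or floor segment, which is the "global holistic information" the abstract contrasts with pixel-level costs.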
Taxonomy
Topics: Advanced Vision and Imaging · 3D Surveying and Cultural Heritage · Robotics and Sensor-Based Localization
