MVLayoutNet: 3D layout reconstruction with multi-view panoramas

Zhihua Hu; Bo Duan; Yanfeng Zhang; Mingwei Sun; Jingwei Huang

arXiv:2112.06133·cs.CV·December 14, 2021

Open Access

TL;DR

MVLayoutNet is an end-to-end neural network that combines learned monocular layout estimation with multi-view stereo for accurate 3D scene reconstruction from panoramas, outperforming existing methods.

Contribution

The paper introduces a novel MVS module with a layout cost volume and attention mechanism, enhancing 3D layout accuracy in panoramic scene reconstruction.

Findings

01 Outperforms prior state-of-the-art methods in depth RMSE by 21.7%.

02 Produces coherent scene layouts that enable full scene reconstruction.

03 Effectively integrates layout estimation with multi-view stereo.
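
For readers unfamiliar with the metric in the first finding, the following is a minimal sketch of depth RMSE and of a relative improvement between two methods. The function names and the sample numbers are hypothetical, and reading "outperforms by X%" as a relative reduction in RMSE against a baseline is an assumption, not something this page states.

```python
import math


def depth_rmse(pred, gt):
    """Root-mean-square error between predicted and ground-truth depths.

    pred, gt: equal-length sequences of per-pixel depth values (e.g. metres).
    """
    if len(pred) != len(gt) or len(pred) == 0:
        raise ValueError("pred and gt must be non-empty and the same length")
    squared = [(p - g) ** 2 for p, g in zip(pred, gt)]
    return math.sqrt(sum(squared) / len(squared))


def relative_reduction(baseline_rmse, method_rmse):
    """Fraction by which method_rmse is lower than baseline_rmse.

    Assumption: an improvement "by X%" means (baseline - method) / baseline.
    """
    return (baseline_rmse - method_rmse) / baseline_rmse


# Hypothetical depths, for illustration only (not results from the paper).
gt = [2.0, 2.5, 3.0, 3.5]
baseline_pred = [2.4, 2.1, 3.4, 3.1]
method_pred = [2.2, 2.3, 3.2, 3.3]

baseline = depth_rmse(baseline_pred, gt)
method = depth_rmse(method_pred, gt)
improvement = relative_reduction(baseline, method)
```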

Abstract

We present MVLayoutNet, an end-to-end network for holistic 3D reconstruction from multi-view panoramas. Our core contribution is to seamlessly combine learned monocular layout estimation and multi-view stereo (MVS) for accurate layout reconstruction in both 3D and image space. We jointly train a layout module to produce an initial layout and a novel MVS module to obtain accurate layout geometry. Unlike standard MVSNet [33], our MVS module takes a newly proposed layout cost volume, which aggregates multi-view costs at the same depth layer into corresponding layout elements. We additionally provide an attention-based scheme that guides the MVS module to focus on structural regions. Such a design considers both local pixel-level costs and global holistic information for better reconstruction. Experiments show that our method outperforms state-of-the-art methods in terms of depth RMSE by 21.7% and…
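
The idea of aggregating per-pixel costs at one depth layer into layout elements can be illustrated with a small sketch. This is not the authors' implementation: the function names, the flattened pixel indexing, the use of a plain mean as the aggregation, and the argmin readout are all assumptions made for illustration. The abstract does not specify how the aggregation or the attention weighting is computed.

```python
def aggregate_layout_costs(cost, labels, num_elements):
    """Pool a per-pixel cost volume into a per-element cost volume.

    cost[d][p]  : matching cost of pixel p at depth layer d (hypothetical layout).
    labels[p]   : index of the layout element (e.g. floor, ceiling, a wall)
                  that pixel p belongs to, taken from an initial layout.
    Returns layout_cost[d][e], the mean cost of element e at depth layer d.
    Assumption: mean pooling; the paper may aggregate differently.
    """
    num_depths = len(cost)
    counts = [0] * num_elements
    for e in labels:
        counts[e] += 1

    layout_cost = [[0.0] * num_elements for _ in range(num_depths)]
    for d in range(num_depths):
        # Sum costs of all pixels that share a layout element at this depth layer.
        for p, e in enumerate(labels):
            layout_cost[d][e] += cost[d][p]
        # Normalise by the pixel count of each element (skip empty elements).
        for e in range(num_elements):
            if counts[e] > 0:
                layout_cost[d][e] /= counts[e]
    return layout_cost


def best_depth_layer(layout_cost, element):
    """Index of the depth layer with the lowest aggregated cost for one element."""
    column = [row[element] for row in layout_cost]
    return min(range(len(column)), key=column.__getitem__)


# Toy example: 3 depth layers, 4 pixels, 2 layout elements (values are made up).
toy_cost = [
    [0.9, 0.8, 0.1, 0.2],
    [0.2, 0.4, 0.7, 0.9],
    [0.5, 0.5, 0.5, 0.5],
]
toy_labels = [0, 0, 1, 1]
toy_layout_cost = aggregate_layout_costs(toy_cost, toy_labels, num_elements=2)
```

In this toy setup each layout element receives one cost per depth layer, so a single depth hypothesis is scored for the whole element rather than per pixel, which is one way to combine local pixel-level costs with holistic layout information.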

Taxonomy

Topics: Advanced Vision and Imaging · 3D Surveying and Cultural Heritage · Robotics and Sensor-Based Localization