PSMNet: Position-aware Stereo Merging Network for Room Layout Estimation
Haiyan Wang, Will Hutchcroft, Yuguang Li, Zhiqiang Wan, Ivaylo Boyadzhiev, Yingli Tian, Sing Bing Kang

TL;DR
This paper introduces PSMNet, a deep learning method that estimates room layout from a pair of 360 panoramas. It combines a Stereo Pano Pose (SP2) transformer with a Cross-Perspective Projection (CP2) layer, and its gains are largest in large and complex room spaces.
Contribution
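The paper presents PSMNet, an end-to-end joint layout-pose estimator. Its Stereo Pano Pose (SP2) transformer implicitly infers correspondences between the two views and can handle noisy poses, while its Cross-Perspective Projection (CP2) layer renders features from the adjacent view into the anchor view so the two can be fused.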
Findings
Outperforms state-of-the-art layout estimators
Effective in large and complex room spaces
Handles noisy pose data
Abstract
In this paper, we propose a new deep learning-based method for estimating room layout given a pair of 360 panoramas. Our system, called Position-aware Stereo Merging Network or PSMNet, is an end-to-end joint layout-pose estimator. PSMNet consists of a Stereo Pano Pose (SP2) transformer and a novel Cross-Perspective Projection (CP2) layer. The stereo-view SP2 transformer is used to implicitly infer correspondences between views, and can handle noisy poses. The pose-aware CP2 layer is designed to render features from the adjacent view to the anchor (reference) view, in order to perform view fusion and estimate the visible layout. Our experiments and analysis validate our method, which significantly outperforms the state-of-the-art layout estimators, especially for large and complex room spaces.
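As a rough illustration of what "pose-aware" projection between views involves, the sketch below maps 2D floor-plane points from the adjacent view's coordinate frame into the anchor view's frame using a relative pose. This is not the CP2 layer itself, which operates on learned features; the function name `to_anchor_frame`, the (rotation angle, translation) pose parameterization, and the use of plain 2D points are all assumptions made for illustration.

```python
import math


def to_anchor_frame(points_adj, theta, translation):
    """Map floor-plane points from the adjacent view's frame to the anchor frame.

    points_adj: list of (x, y) points in the adjacent view's frame.
    theta: assumed relative rotation in radians (counter-clockwise).
    translation: assumed (tx, ty) position of the adjacent camera in the anchor frame.
    """
    c, s = math.cos(theta), math.sin(theta)
    tx, ty = translation
    # Rigid 2D transform: rotate each point, then translate it.
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points_adj]


# Hypothetical example: one wall corner seen from the adjacent view.
corner_in_anchor = to_anchor_frame([(1.0, 0.0)], math.pi / 2, (2.0, 0.0))
```

A noisy `theta` or `translation` shifts every projected point, which gives some intuition for why the abstract stresses that the method can handle noisy poses.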
Taxonomy
Topics: Advanced Vision and Imaging · 3D Surveying and Cultural Heritage · Robotics and Sensor-Based Localization
