LGT-Net: Indoor Panoramic Room Layout Estimation with Geometry-Aware Transformer Network

Zhigang Jiang; Zhongzheng Xiang; Jinhua Xu; Ming Zhao

arXiv:2203.01824 · cs.CV · March 28, 2022 · 1 citation

TL;DR

LGT-Net is a novel Transformer-based network that enhances indoor panoramic room layout estimation by integrating geometry-aware features and a specialized loss function, outperforming existing methods on benchmark datasets.

Contribution

The paper introduces LGT-Net, which combines an SWG-Transformer architecture with a planar-geometry-aware loss and uses omnidirectional geometry cues (horizon-depth and room height) to improve the accuracy of 3D room layout estimation.

Findings

01

Outperforms state-of-the-art methods on benchmark datasets.

02

Effectively models local and global geometry relations.

03

Utilizes horizon-depth and room height for comprehensive geometry awareness.

Abstract

3D room layout estimation from a single panorama using deep neural networks has made great progress. However, previous approaches cannot obtain efficient geometry awareness of the room layout using only the latitude of boundaries or the horizon-depth. We show that using horizon-depth together with room height yields omnidirectional geometry awareness of the room layout in both the horizontal and vertical directions. In addition, we propose a planar-geometry-aware loss function that uses normals and gradients of normals to supervise the planarity of walls and the turning of corners. We propose an efficient network, LGT-Net, for room layout estimation, which contains a novel Transformer architecture called SWG-Transformer to model geometry relations. SWG-Transformer consists of (Shifted) Window Blocks and Global Blocks to combine the local and global geometry relations. Moreover, we design a novel relative…
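As a rough illustration of the geometry the abstract describes, the sketch below converts per-column horizon-depth into floor-plan points, derives floor and ceiling boundary latitudes from a room height, and measures how much wall normals turn between neighboring columns. It is not the authors' implementation: the function names, the camera-centered coordinate convention, the `camera_height` parameter, and the use of an angle between adjacent normals as a stand-in for "gradients of normals" are all assumptions made here for illustration.

```python
import math


def depth_to_points(depths):
    """Map per-column horizon-depth to (x, z) floor-plan points.

    Assumption: column i looks along longitude 2*pi*i/N, measured from +z toward +x,
    and the camera sits at the origin.
    """
    n = len(depths)
    points = []
    for i, d in enumerate(depths):
        theta = 2.0 * math.pi * i / n
        points.append((d * math.sin(theta), d * math.cos(theta)))
    return points


def boundary_latitudes(depths, camera_height, room_height):
    """Latitudes (radians) of the floor and ceiling boundaries for each column.

    Assumption: the floor lies camera_height below the camera and the ceiling
    lies (room_height - camera_height) above it.
    """
    floor = [math.atan2(-camera_height, d) for d in depths]
    ceiling = [math.atan2(room_height - camera_height, d) for d in depths]
    return floor, ceiling


def wall_normals(points):
    """Unit normal of each segment joining neighboring points (cyclic)."""
    n = len(points)
    normals = []
    for i in range(n):
        x0, z0 = points[i]
        x1, z1 = points[(i + 1) % n]
        dx, dz = x1 - x0, z1 - z0
        length = math.hypot(dx, dz)
        # Rotate the segment direction by -90 degrees to get a normal.
        normals.append((dz / length, -dx / length))
    return normals


def normal_turning(normals):
    """Angle (radians) between each normal and the next one (cyclic).

    Near zero along a planar wall, large where the boundary turns a corner.
    """
    n = len(normals)
    angles = []
    for i in range(n):
        ax, az = normals[i]
        bx, bz = normals[(i + 1) % n]
        dot = max(-1.0, min(1.0, ax * bx + az * bz))
        angles.append(math.acos(dot))
    return angles
```

For a square room sampled at 8 columns with the camera at its center, `normal_turning` gives values close to π/2 where adjacent segments meet at a corner and close to 0 where they lie on the same wall, which is the kind of signal a loss on normals and their gradients could supervise.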

Code & Models

Repositories

zhigangjiang/LGT-Net
PyTorch · Official

Taxonomy

Topics: Video Surveillance and Tracking Methods · Advanced Vision and Imaging · 3D Surveying and Cultural Heritage

Methods: Linear Layer · Average Pooling · Batch Normalization · 1x1 Convolution · Attentive Walk-Aggregating Graph Neural Network · Max Pooling · Global Average Pooling · Bottleneck Residual Block · Residual Block