Connecting the Dots: Floorplan Reconstruction Using Two-Level Queries

Yuanwen Yue; Theodora Kontogianni; Konrad Schindler; Francis Engelmann

arXiv:2211.15658 · cs.CV · March 29, 2023

TL;DR

This paper introduces a novel Transformer-based approach for 2D floorplan reconstruction from 3D scans, enabling end-to-end prediction of room polygons with improved accuracy and speed over previous multi-stage methods.

Contribution

It formulates floorplan reconstruction as a single-stage structured prediction task using a new Transformer architecture with two-level queries, eliminating the need for hand-crafted intermediate stages.
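One way to picture the two-level queries is as a fixed grid: M polygon slots, each holding N corner slots, where every corner slot carries coordinates plus a validity score. The sketch below illustrates how such a grid could be decoded into a variable-size set of variable-length polygons. It is an illustration under that assumption, not the paper's code; `decode_rooms`, the 0.5 threshold, and the three-corner minimum are hypothetical choices.

```python
from typing import List, Tuple

Corner = Tuple[float, float, float]  # (x, y, validity score in [0, 1])
Polygon = List[Tuple[float, float]]


def decode_rooms(grid: List[List[Corner]], threshold: float = 0.5,
                 min_corners: int = 3) -> List[Polygon]:
    """Turn a fixed M x N grid of corner predictions into a variable-size
    set of variable-length polygons by dropping low-scoring corners."""
    rooms: List[Polygon] = []
    for polygon_slot in grid:  # first level: one slot per candidate room
        # Second level: keep corners, in order, whose score clears the threshold.
        corners = [(x, y) for x, y, score in polygon_slot if score >= threshold]
        if len(corners) >= min_corners:  # fewer than 3 corners is not a polygon
            rooms.append(corners)
    return rooms


# Toy grid with M = 3 polygon slots and N = 5 corner slots each (made-up values).
toy_grid = [
    [(0, 0, 0.9), (4, 0, 0.8), (4, 3, 0.95), (0, 3, 0.7), (0, 0, 0.1)],
    [(4, 0, 0.9), (6, 0, 0.9), (6, 3, 0.9), (5, 3, 0.2), (4, 3, 0.9)],
    [(1, 1, 0.1), (2, 2, 0.2), (3, 3, 0.3), (0, 0, 0.0), (0, 0, 0.0)],
]
rooms = decode_rooms(toy_grid)
```

In this toy input the third slot yields no room at all, and the second slot drops one low-scoring corner, so both the number of polygons and their lengths vary even though the grid itself has a fixed shape.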

Findings

01 Achieves state-of-the-art results on the Structured3D and SceneCAD datasets.

02 Provides significantly faster inference than previous methods.

03 Can be extended to predict semantic and architectural details.

Abstract

We address 2D floorplan reconstruction from 3D scans. Existing approaches typically employ heuristically designed multi-stage pipelines. Instead, we formulate floorplan reconstruction as a single-stage structured prediction task: find a variable-size set of polygons, which in turn are variable-length sequences of ordered vertices. To solve it we develop a novel Transformer architecture that generates polygons of multiple rooms in parallel, in a holistic manner without hand-crafted intermediate stages. The model features two-level queries for polygons and corners, and includes polygon matching to make the network end-to-end trainable. Our method achieves a new state-of-the-art for two challenging datasets, Structured3D and SceneCAD, along with significantly faster inference than previous methods. Moreover, it can readily be extended to predict additional information, i.e., semantic room…
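The polygon matching step mentioned in the abstract can be pictured as a one-to-one assignment between predicted and ground-truth polygons that minimizes a total cost. The sketch below is a toy illustration only: the centroid-distance cost and the brute-force search over permutations are stand-ins chosen for brevity (set-prediction models commonly use the Hungarian algorithm instead), and `match_polygons` and `match_cost` are hypothetical names, not the authors' API.

```python
from itertools import permutations
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def centroid(polygon: Sequence[Point]) -> Point:
    """Mean of the polygon's corner coordinates."""
    n = len(polygon)
    return (sum(x for x, _ in polygon) / n, sum(y for _, y in polygon) / n)


def match_cost(pred: Sequence[Point], target: Sequence[Point]) -> float:
    """Assumed cost: L1 distance between centroids (a stand-in, not the paper's)."""
    (px, py), (tx, ty) = centroid(pred), centroid(target)
    return abs(px - tx) + abs(py - ty)


def match_polygons(preds: Sequence[Sequence[Point]],
                   targets: Sequence[Sequence[Point]]) -> List[Tuple[int, int]]:
    """Return (pred_index, target_index) pairs with minimal total cost.

    Brute force over permutations: exponential, fine only for tiny examples.
    """
    if len(preds) < len(targets):
        raise ValueError("need at least as many predictions as targets")
    best_cost = float("inf")
    best: List[Tuple[int, int]] = []
    for chosen in permutations(range(len(preds)), len(targets)):
        cost = sum(match_cost(preds[p], targets[t]) for t, p in enumerate(chosen))
        if cost < best_cost:
            best_cost = cost
            best = [(p, t) for t, p in enumerate(chosen)]
    return best


# Made-up example: three predictions, two ground-truth rooms.
preds = [
    [(4.1, 0), (6, 0), (6, 3), (4, 3)],    # close to the second room
    [(9, 9), (11, 9), (10, 11)],           # spurious prediction
    [(0, 0), (4, 0), (4, 3.1), (0, 3)],    # close to the first room
]
targets = [
    [(0, 0), (4, 0), (4, 3), (0, 3)],
    [(4, 0), (6, 0), (6, 3), (4, 3)],
]
matches = match_polygons(preds, targets)
```

Predictions left unmatched (here the spurious triangle) would be the ones a training loss treats as "no room"; because each ground-truth polygon is tied to exactly one prediction, a loss can be computed without any hand-crafted ordering of rooms.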

Code & Models

Repositories

ywyue/roomformer (PyTorch, official)

Taxonomy

Topics: Computer Graphics and Visualization Techniques · Advanced Vision and Imaging · Remote Sensing and LiDAR Applications

Methods: Multi-Head Attention · Attention Is All You Need · Layer Normalization · Softmax · Adam · Dropout · Byte Pair Encoding · Position-Wise Feed-Forward Layer · Label Smoothing · Absolute Position Encodings