Estimating Generic 3D Room Structures from 2D Annotations
Denys Rozumnyi, Stefan Popov, Kevis-Kokitsi Maninis, Matthias Nießner, Vittorio Ferrari

TL;DR
This paper introduces a novel method to estimate 3D room layouts from simple 2D segmentation masks, enabling easier annotation and automatic reconstruction of room structures from RGB videos.
Contribution
It presents a new approach that derives 3D room layouts from 2D annotations, which reduces annotation difficulty, and it releases a dataset of 2246 3D room layouts annotated on RealEstate10k.
Findings
High-quality 3D room layout annotations achieved
Automatic reconstruction from 2D masks is effective
Public release of 2246 annotated 3D room layouts
Abstract
Indoor rooms are among the most common use cases in 3D scene understanding. Current state-of-the-art methods for this task are driven by large annotated datasets. Room layouts are especially important, consisting of structural elements in 3D, such as wall, floor, and ceiling. However, they are difficult to annotate, especially on pure RGB video. We propose a novel method to produce generic 3D room layouts just from 2D segmentation masks, which are easy to annotate for humans. Based on these 2D annotations, we automatically reconstruct 3D plane equations for the structural elements and their spatial extent in the scene, and connect adjacent elements at the appropriate contact edges. We annotate and publicly release 2246 3D room layouts on the RealEstate10k dataset, containing YouTube videos. We demonstrate the high quality of these 3D layout annotations with extensive experiments.
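The abstract mentions two geometric building blocks: a 3D plane equation per structural element and a contact edge between adjacent elements. The summary does not describe how the paper computes either, so the following is only a minimal sketch of the underlying geometry, not the authors' pipeline. It assumes 3D points on each surface are already available, and the helper names `fit_plane` and `plane_intersection` are hypothetical.

```python
import numpy as np


def fit_plane(points):
    """Least-squares plane through N x 3 points.

    Returns (n, d) with unit normal n and offset d such that n . x + d = 0.
    Hypothetical helper for illustration only.
    """
    points = np.asarray(points, dtype=float)
    centroid = points.mean(axis=0)
    # The right singular vector with the smallest singular value is the
    # direction of least variance, i.e. the plane normal.
    _, _, vt = np.linalg.svd(points - centroid)
    normal = vt[-1]
    return normal, -normal.dot(centroid)


def plane_intersection(n1, d1, n2, d2):
    """Line where two non-parallel planes meet.

    Returns (point, direction): a point on the line and a unit direction.
    """
    direction = np.cross(n1, n2)
    norm = np.linalg.norm(direction)
    if norm < 1e-9:
        raise ValueError("planes are parallel; no unique contact edge")
    # Pick the point that lies on both planes and has zero component
    # along the line direction, which makes the 3 x 3 system well-posed.
    A = np.stack([n1, n2, direction])
    b = np.array([-d1, -d2, 0.0])
    point = np.linalg.solve(A, b)
    return point, direction / norm


# Toy example (assumed data): a floor at z = 0 and a wall at x = 2.
floor_pts = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0)]
wall_pts = [(2, 0, 0), (2, 1, 0), (2, 0, 1), (2, 1, 1)]

n_floor, d_floor = fit_plane(floor_pts)
n_wall, d_wall = fit_plane(wall_pts)
edge_point, edge_dir = plane_intersection(n_floor, d_floor, n_wall, d_wall)
```

In this toy case the contact edge is the line through (2, 0, 0) running along the y axis. The sign of the fitted normals is arbitrary, so the edge direction may come out as either +y or -y.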
Taxonomy
Topics: Advanced Vision and Imaging · Video Surveillance and Tracking Methods · 3D Surveying and Cultural Heritage
