A Real-Time Online Learning Framework for Joint 3D Reconstruction and Semantic Segmentation of Indoor Scenes

Davide Menini; Suryansh Kumar; Martin R. Oswald; Erik Sandstrom; Cristian Sminchisescu; Luc Van Gool

arXiv:2108.05246·cs.CV·December 30, 2021

TL;DR

This paper introduces a real-time online framework that jointly reconstructs 3D indoor scenes and performs semantic segmentation using deep learning, improving accuracy and efficiency in noisy, real-world conditions.

Contribution

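It proposes a deep neural network with an efficient vortex pooling block for online fusion of depth and semantic labels. The network drops the routing network used in online depth fusion to preserve high-frequency surface details, while the semantic context helps the depth fusion network learn noise-resistant features.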

Findings

01. Performs depth fusion at 37 and 10 fps while retaining high reconstruction accuracy (see 02)

02. Attains reconstruction F-scores of 88% and 91% on the Replica dataset

03. Achieves an average IoU of 0.515 on the ScanNet benchmark
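
The F-score and IoU figures above follow standard metric definitions. As a rough illustration, the sketch below computes an F-score as the harmonic mean of precision and recall, and a mean IoU from per-class true positives, false positives, and false negatives. The function names and the toy labels are illustrative assumptions; they are not taken from the paper or its repository, and the benchmark evaluation protocols (distance thresholds, ignored classes) are not reproduced here.

```python
from typing import Sequence


def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Average of per-class IoU = TP / (TP + FP + FN).

    Classes that appear in neither pred nor target are skipped.
    """
    ious = []
    for c in range(num_classes):
        tp = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        fp = sum(1 for p, t in zip(pred, target) if p == c and t != c)
        fn = sum(1 for p, t in zip(pred, target) if p != c and t == c)
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with two classes and four labelled points (hypothetical data).
toy_pred = [0, 0, 1, 1]
toy_target = [0, 1, 1, 1]
toy_miou = mean_iou(toy_pred, toy_target, num_classes=2)
```

In the toy example, class 0 has an IoU of 1/2 and class 1 has an IoU of 2/3, so the mean IoU is their average.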

Abstract

This paper presents a real-time online vision framework to jointly recover an indoor scene's 3D structure and semantic labels. Given noisy depth maps, a camera trajectory, and 2D semantic labels at training time, the proposed deep-neural-network-based approach learns to fuse the depth over frames with suitable semantic labels in the scene space. Our approach exploits the joint volumetric representation of the depth and semantics in the scene feature space to solve this task. For a compelling online fusion of the semantic labels and geometry in real time, we introduce an efficient vortex pooling block while dropping the use of a routing network in online depth fusion to preserve high-frequency surface details. We show that the context information provided by the semantics of the scene helps the depth fusion network learn noise-resistant features. Not only that, it helps overcome the…
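
For context on what "fusing depth over frames" in a volumetric representation means, the classical hand-crafted baseline keeps a truncated signed distance (TSDF) and a weight per voxel and updates them with a running weighted average as each depth frame arrives. The sketch below shows only that classical per-voxel update, not the learned fusion network described in the abstract. The names `truncate` and `fuse_voxel`, the truncation distance of 0.1, and the unit observation weight are assumptions chosen for illustration.

```python
def truncate(sdf: float, trunc: float) -> float:
    """Clamp a signed distance to the band [-trunc, trunc]."""
    return max(-trunc, min(trunc, sdf))


def fuse_voxel(tsdf: float, weight: float, new_sdf: float,
               trunc: float = 0.1, new_weight: float = 1.0):
    """Classical running weighted-average TSDF update for a single voxel.

    Returns the updated (tsdf, weight) pair.
    """
    d = truncate(new_sdf, trunc)
    total = weight + new_weight
    fused = (weight * tsdf + new_weight * d) / total
    return fused, total


# Integrate three noisy signed-distance observations of the same voxel
# (hypothetical values, in the same units as the truncation distance).
tsdf, w = 0.0, 0.0
for obs in (0.04, 0.06, 0.05):
    tsdf, w = fuse_voxel(tsdf, w, obs)
```

With equal weights the update reduces to the mean of the truncated observations, which is why a simple average smooths noise but can also blur thin structures; learned fusion approaches aim to do better than this fixed rule.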

Code & Models

Repositories

suryanshkumar/online-joint-depthfusion-and-semantic
PyTorch · Official

Taxonomy

Topics: Advanced Vision and Imaging · Robotics and Sensor-Based Localization · Remote Sensing and LiDAR Applications