Efficient Multi-Task RGB-D Scene Analysis for Indoor Environments
Daniel Seichter, Söhnke Benedikt Fischedick, Mona Köhler, Horst-Michael Groß

TL;DR
This paper introduces EMSANet, an efficient multi-task neural network that performs semantic segmentation, instance segmentation, orientation estimation, and scene classification simultaneously on RGB-D data, enabling real-time indoor scene analysis on mobile devices.
Contribution
The paper presents EMSANet, the first multi-task model for comprehensive indoor scene analysis on RGB-D data that operates in real time on mobile platforms.
Findings
A single neural network performs all four tasks in real time on a mobile platform.
Multi-task learning improves performance across tasks.
Extended annotations enable comprehensive evaluation.
Abstract
Semantic scene understanding is essential for mobile agents acting in various environments. Although semantic segmentation already provides a lot of information, details about individual objects as well as the general scene are missing but required for many real-world applications. However, solving multiple tasks separately is expensive and cannot be accomplished in real time given limited computing and battery capabilities on a mobile platform. In this paper, we propose an efficient multi-task approach for RGB-D scene analysis (EMSANet) that simultaneously performs semantic and instance segmentation (panoptic segmentation), instance orientation estimation, and scene classification. We show that all tasks can be accomplished using a single neural network in real time on a mobile platform without diminishing performance - by contrast, the individual tasks are able to benefit from each…
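The abstract treats semantic plus instance segmentation as panoptic segmentation. As a rough illustration of what that combination means, the sketch below merges a per-pixel semantic map and a per-pixel instance map into a single panoptic map, using the common "class id times divisor plus instance id" encoding. This is not taken from the paper: the function names, `LABEL_DIVISOR`, the class ids, and the choice of "thing" classes are all hypothetical, and EMSANet's actual merging step may differ.

```python
# Hypothetical sketch: merge semantic and instance maps into panoptic ids.
# LABEL_DIVISOR and THING_CLASSES are illustrative assumptions, not values
# from the paper.

LABEL_DIVISOR = 1000     # assumed upper bound on instances per class
THING_CLASSES = {2, 3}   # assumed ids of countable classes (e.g. chair, table)


def to_panoptic(semantic, instance):
    """Combine two same-shaped 2D lists of ints into one panoptic map."""
    panoptic = []
    for sem_row, inst_row in zip(semantic, instance):
        row = []
        for sem, inst in zip(sem_row, inst_row):
            # "Stuff" pixels (e.g. wall, floor) carry no instance id.
            inst_id = inst if sem in THING_CLASSES else 0
            row.append(sem * LABEL_DIVISOR + inst_id)
        panoptic.append(row)
    return panoptic


def from_panoptic(panoptic_id):
    """Split a panoptic id back into (semantic class, instance id)."""
    return divmod(panoptic_id, LABEL_DIVISOR)


# Toy 2x3 image: class 1 is stuff, classes 2 and 3 are things.
semantic = [[1, 2, 2], [1, 3, 2]]
instance = [[0, 1, 1], [5, 1, 2]]
panoptic = to_panoptic(semantic, instance)
```

In this toy input the stray instance id 5 on a stuff pixel is discarded, so both class-1 pixels end up with the same panoptic id, while the two class-2 instances stay distinguishable.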
Taxonomy
Topics: Advanced Neural Network Applications · Video Surveillance and Tracking Methods · Advanced Image and Video Retrieval Techniques
