Robust Double-Encoder Network for RGB-D Panoptic Segmentation
Matteo Sodano, Federico Magistri, Tiziano Guadagnino, Jens Behley, Cyrill Stachniss

TL;DR
This paper introduces a robust double-encoder neural network for RGB-D panoptic segmentation that effectively combines RGB and depth data, improving scene understanding in indoor environments with flexibility to handle missing cues.
Contribution
The novel ResidualExcite merging approach and the double-encoder architecture enable robust, flexible panoptic segmentation across RGB, depth, or combined inputs without retraining.
Findings
Achieves superior results on public datasets.
Robust to missing RGB or depth cues.
Flexible to different input modalities.
Abstract
Perception is crucial for robots that act in real-world environments, as autonomous systems need to see and understand the world around them to act properly. Panoptic segmentation provides an interpretation of the scene by computing a pixelwise semantic label together with instance IDs. In this paper, we address panoptic segmentation using RGB-D data of indoor scenes. We propose a novel encoder-decoder neural network that processes RGB and depth separately through two encoders. The features of the individual encoders are progressively merged at different resolutions, such that the RGB features are enhanced using complementary depth information. We propose a novel merging approach called ResidualExcite, which reweighs each entry of the feature map according to its importance. With our double-encoder architecture, we are robust to missing cues. In particular, the same model can train and…
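The abstract says only that ResidualExcite reweighs each entry of the feature map according to its importance and that depth features enhance the RGB features; it does not give the formula. The NumPy sketch below is one possible reading of that idea, not the authors' implementation. The function name `residual_excite_sketch`, the 1x1-convolution scoring, the sigmoid gate, and the scale `lam` are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(x):
    """Logistic function, maps scores to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def residual_excite_sketch(rgb_feat, depth_feat, w, b, lam=1.0):
    """Hypothetical entry-wise residual gating (an assumption, not the paper's code).

    rgb_feat, depth_feat: feature maps of shape (C, H, W).
    w: (C, C) weights of an assumed 1x1 convolution; b: (C,) bias.
    lam: assumed scalar weighting the depth contribution.
    """
    # A 1x1 convolution mixes channels at every pixel, giving one
    # importance score per entry of the depth feature map.
    scores = np.einsum("oc,chw->ohw", w, depth_feat) + b[:, None, None]
    gate = sigmoid(scores)
    # Reweigh every depth entry by its gate value.
    excited = gate * depth_feat
    # Residual merge: RGB features pass through, depth is added on top.
    return rgb_feat + lam * excited


rng = np.random.default_rng(0)
C, H, W = 4, 3, 3
rgb = rng.normal(size=(C, H, W))
depth = rng.normal(size=(C, H, W))
w = rng.normal(size=(C, C))
b = np.zeros(C)

fused = residual_excite_sketch(rgb, depth, w, b)
# With all-zero depth features the merge returns the RGB features unchanged.
no_depth = residual_excite_sketch(rgb, np.zeros_like(depth), w, b)
```

In this sketch the residual form means that all-zero depth features leave the RGB features untouched, which is one plausible way a merge of this kind could tolerate a missing depth cue. Whether the paper achieves its robustness this way is not stated in the text above.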
Taxonomy
Topics: Video Surveillance and Tracking Methods · Advanced Neural Network Applications · Advanced Vision and Imaging
