Scale Invariant Semantic Segmentation with RGB-D Fusion
Mohammad Dawud Ansari, Alwi Husada, Didier Stricker

TL;DR
This paper introduces a scale-invariant semantic segmentation neural network that fuses RGB and depth data, improving segmentation accuracy across varying object scales in outdoor scenes, validated on Cityscapes and synthetic datasets.
Contribution
It presents a novel fusion block for combining RGB and depth features in a scale-invariant segmentation model based on DeepLab-v2, enhancing performance on outdoor scenes.
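The summary does not say how the fusion block is built, so the following is only a minimal sketch of one common way to combine two feature maps: concatenate them along the channel axis, then mix channels with a 1x1 (pointwise) convolution. This is not the authors' block. The names `concat_channels`, `pointwise_conv`, and `fuse`, and the plain nested-list feature layout, are assumptions made for illustration.

```python
from typing import List

# Hypothetical layout: feature[channel][row][col]
FeatureMap = List[List[List[float]]]


def concat_channels(rgb_feat: FeatureMap, depth_feat: FeatureMap) -> FeatureMap:
    """Stack RGB and depth feature maps along the channel axis."""
    same_height = len(rgb_feat[0]) == len(depth_feat[0])
    same_width = len(rgb_feat[0][0]) == len(depth_feat[0][0])
    if not (same_height and same_width):
        raise ValueError("RGB and depth feature maps must share spatial size")
    return rgb_feat + depth_feat


def pointwise_conv(feat: FeatureMap,
                   weights: List[List[float]],
                   bias: List[float]) -> FeatureMap:
    """1x1 convolution: a per-pixel linear map over channels.

    weights[o][c] is the weight from input channel c to output channel o.
    """
    height, width = len(feat[0]), len(feat[0][0])
    out = []
    for w_row, b in zip(weights, bias):
        plane = [
            [b + sum(w * feat[c][y][x] for c, w in enumerate(w_row))
             for x in range(width)]
            for y in range(height)
        ]
        out.append(plane)
    return out


def fuse(rgb_feat: FeatureMap, depth_feat: FeatureMap,
         weights: List[List[float]], bias: List[float]) -> FeatureMap:
    """Assumed fusion: channel concatenation followed by a 1x1 convolution."""
    return pointwise_conv(concat_channels(rgb_feat, depth_feat), weights, bias)


# Toy usage: one RGB channel and one depth channel, each 1x2, averaged.
fused = fuse([[[1.0, 2.0]]], [[[10.0, 20.0]]], weights=[[0.5, 0.5]], bias=[0.0])
```

In a real model the weights would be learned and the feature maps would come from the intermediate layers of the two branches; the sketch only shows the data flow.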
Findings
Comparable to state-of-the-art on Cityscapes
Effective on real and synthetic datasets
Improves segmentation of multi-scale objects
Abstract
In this paper, we propose a neural network architecture for scale-invariant semantic segmentation using RGB-D images. We use depth information as an additional modality alongside color images. This is especially useful in outdoor scenes, where objects appear at different scales depending on their distance from the camera: nearby objects occupy significantly more pixels than distant ones. We incorporate depth information into the RGB data for pixel-wise semantic segmentation to handle objects of different scales in outdoor scenes. We adopt the well-known DeepLab-v2 (ResNet-101) model as our RGB baseline. Depth images are passed as an additional input through a separate branch. The intermediate feature maps of the color and depth branches are fused using a novel fusion block. Our model is compact and can be easily applied to other RGB models.…
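The link between distance and pixel count mentioned in the abstract can be made concrete with the standard pinhole camera model, in which projected size scales as 1/Z and projected area as 1/Z². The sketch below is illustrative only: the function names, the 1000-pixel focal length, and the object dimensions are assumptions, not values from the paper.

```python
def projected_size_px(size_m: float, distance_m: float, focal_length_px: float) -> float:
    """Pinhole projection: image-plane extent in pixels of an object facing the camera."""
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    return focal_length_px * size_m / distance_m


def projected_area_px(height_m: float, width_m: float,
                      distance_m: float, focal_length_px: float) -> float:
    """Approximate pixel area of a fronto-parallel rectangle at a given distance."""
    return (projected_size_px(height_m, distance_m, focal_length_px)
            * projected_size_px(width_m, distance_m, focal_length_px))


# Hypothetical example: a 1.5 m x 1.8 m car rear seen at 10 m and at 40 m.
FOCAL_PX = 1000.0  # assumed focal length in pixels
near_area = projected_area_px(1.5, 1.8, 10.0, FOCAL_PX)
far_area = projected_area_px(1.5, 1.8, 40.0, FOCAL_PX)
area_ratio = near_area / far_area  # 4x the distance gives about 16x fewer pixels
```

Because the pixel count falls off quadratically with distance, a per-pixel depth value gives the network a direct cue about the scale at which an object is being observed.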
Taxonomy
Topics: Advanced Neural Network Applications · Video Surveillance and Tracking Methods · Advanced Image and Video Retrieval Techniques
Methods: Entropy Regularization · Proximal Policy Optimization · CARLA: An Open Urban Driving Simulator
