Delivering Arbitrary-Modal Semantic Segmentation

Jiaming Zhang; Ruiping Liu; Hao Shi; Kailun Yang; Simon Reiß; Kunyu Peng; Haodong Fu; Kaiwei Wang; Rainer Stiefelhagen

arXiv:2303.01480 · cs.CV · March 3, 2023 · 5 cites

Open Access · 1 Repo

TL;DR

This paper introduces a new benchmark and a flexible model for semantic segmentation that effectively fuses an arbitrary number of modalities, improving robustness especially under challenging weather and sensor failure conditions.

Contribution

The paper presents the DeLiVER benchmark for arbitrary-modal segmentation and the CMNeXt model, enabling scalable fusion of multiple modalities with minimal additional parameters.

Findings

01 CMNeXt achieves state-of-the-art results on six benchmarks.

02 The DeLiVER dataset includes severe weather and sensor failure scenarios.

03 Quad-modal CMNeXt improves mIoU by 9.10 percentage points over the mono-modal baseline.
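The findings above are reported in mIoU (mean intersection over union). As a reminder of what that metric measures, here is a minimal sketch in plain Python. The function name `mean_iou` and the toy labels are illustrative assumptions, not taken from the paper or its repository. Benchmark scores are normally accumulated over a whole dataset rather than a single short label list.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that occur in pred or target.

    pred and target are flat sequences of integer class ids of equal length.
    Illustrative helper only; not the evaluation code of the DELIVER repo.
    """
    inter = [0] * num_classes  # pixels where prediction and label agree on class c
    union = [0] * num_classes  # pixels where prediction or label is class c
    for p, t in zip(pred, target):
        if p == t:
            inter[p] += 1
            union[p] += 1
        else:
            union[p] += 1
            union[t] += 1
    # Skip classes that never appear, so they do not distort the mean.
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious)


# Toy example with three classes and six pixels.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {score:.2%}")
```

In this toy case the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5. A gain such as the one reported above is a difference between two such scores, each expressed in percent.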

Abstract

Multimodal fusion can make semantic segmentation more robust. However, fusing an arbitrary number of modalities remains underexplored. To delve into this problem, we create the DeLiVER arbitrary-modal segmentation benchmark, covering Depth, LiDAR, multiple Views, Events, and RGB. Aside from this, we provide this dataset in four severe weather conditions as well as five sensor failure cases to exploit modal complementarity and resolve partial outages. To make this possible, we present the arbitrary cross-modal segmentation model CMNeXt. It encompasses a Self-Query Hub (SQ-Hub) designed to extract effective information from any modality for subsequent fusion with the RGB representation and adds only negligible amounts of parameters (~0.01M) per additional modality. On top, to efficiently and flexibly harvest discriminative cues from the auxiliary modalities, we introduce the simple…
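To make the idea of fusing an arbitrary number of auxiliary modalities with an RGB representation concrete, here is a toy sketch in plain Python. It is not the SQ-Hub or CMNeXt as implemented by the authors. It assumes a hypothetical scheme in which each auxiliary modality owns one small query vector, scores its own features against that query, and a softmax over the available modalities decides how much each contributes at every position. The names `fuse`, `queries`, and the feature values are all invented for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(rgb, aux, queries):
    """Add a softly selected auxiliary feature to each RGB feature.

    rgb:     list of feature vectors, one per spatial position
    aux:     dict mapping modality name -> list of feature vectors (same shape as rgb)
    queries: dict mapping modality name -> query vector (hypothetical per-modality parameters)
    Any subset of modalities may be present; with none, rgb is returned unchanged.
    """
    names = sorted(aux)
    fused = []
    for pos, rgb_vec in enumerate(rgb):
        if not names:
            fused.append(list(rgb_vec))
            continue
        # Score each modality's feature at this position against its own query.
        scores = [
            sum(f * q for f, q in zip(aux[n][pos], queries[n])) for n in names
        ]
        weights = softmax(scores)
        # Weighted sum across modalities, then residual addition to RGB.
        selected = [
            sum(w * aux[n][pos][c] for w, n in zip(weights, names))
            for c in range(len(rgb_vec))
        ]
        fused.append([r + s for r, s in zip(rgb_vec, selected)])
    return fused


rgb = [[1.0, 2.0]]
aux = {"depth": [[0.5, 0.5]], "lidar": [[1.5, -0.5]]}
queries = {"depth": [0.0, 0.0], "lidar": [0.0, 0.0]}  # zero queries give equal weights

fused_all = fuse(rgb, aux, queries)
fused_depth_only = fuse(rgb, {"depth": aux["depth"]}, queries)
fused_rgb_only = fuse(rgb, {}, queries)
```

In this sketch, dropping a modality only removes one entry from `aux`, so the same function covers the mono-modal, dual-modal, and quad-modal cases, and each added modality costs one query vector. That mirrors the spirit of the abstract's claim of negligible per-modality overhead, but the actual parameter count and mechanism in the paper may differ.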

Code & Models

Repositories

jamycheung/DELIVER
PyTorch · Official

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning