Revisiting Multi-modal 3D Semantic Segmentation in Real-world Autonomous Driving

Feng Jiang; Chaoping Tu; Gang Zhang; Jun Li; Hanqing Huang; Junyu Lin; Di Feng; Jian Pu

arXiv:2310.08826·cs.CV·October 16, 2023·2 cites

TL;DR

This paper introduces CPGNet-LCF, a multi-modal fusion framework for 3D semantic segmentation in autonomous driving that is efficient, robust to weak LiDAR-camera calibration, and achieves state-of-the-art results on the nuScenes and SemanticKITTI benchmarks.

Contribution

We propose a novel fusion framework with weak calibration knowledge distillation, improving robustness and real-time performance in multi-modal 3D segmentation.

Findings

01. Achieves state-of-the-art results on nuScenes and SemanticKITTI.

02. Runs in 20 ms per frame on a Tesla V100 GPU.

03. Demonstrates robustness across various calibration levels.

Abstract

LiDAR and camera are two critical sensors for multi-modal 3D semantic segmentation and are supposed to be fused efficiently and robustly to promise safety in various real-world scenarios. However, existing multi-modal methods face two key challenges: 1) difficulty with efficient deployment and real-time execution; and 2) drastic performance degradation under weak calibration between LiDAR and cameras. To address these challenges, we propose CPGNet-LCF, a new multi-modal fusion framework extending the LiDAR-only CPGNet. CPGNet-LCF solves the first challenge by inheriting the easy deployment and real-time capabilities of CPGNet. For the second challenge, we introduce a novel weak calibration knowledge distillation strategy during training to improve the robustness against the weak calibration. CPGNet-LCF achieves state-of-the-art performance on the nuScenes and SemanticKITTI benchmarks.…
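
To make the notion of weak calibration concrete, the following is a minimal sketch, not taken from the paper, of how a small extrinsic error shifts the pixel that a LiDAR point is fused with. It uses a standard pinhole camera model. The intrinsics, the 1-degree yaw error, the assumption that the LiDAR frame is already aligned with the camera frame, and all function names are illustrative assumptions rather than anything the authors specify.

```python
import math


def project_point(point, rotation, translation, fx, fy, cx, cy):
    """Project one 3D point into pixel coordinates with a pinhole model.

    rotation is a 3x3 list of rows and translation a length-3 list; together
    they map the point into the camera frame (x right, y down, z forward).
    Returns None when the point lies behind the camera.
    """
    x, y, z = (
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )
    if z <= 0.0:
        return None
    return (fx * x / z + cx, fy * y / z + cy)


def yaw_rotation(angle_rad):
    """Rotation about the camera's vertical (y) axis by angle_rad."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


# Hypothetical setup: identity extrinsics and made-up intrinsics.
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
NO_SHIFT = [0.0, 0.0, 0.0]
FX, FY, CX, CY = 1000.0, 1000.0, 800.0, 450.0

point = [0.0, 0.0, 20.0]  # a point 20 m straight ahead of the camera

# Projection with the correct calibration.
ideal = project_point(point, IDENTITY, NO_SHIFT, FX, FY, CX, CY)

# Projection when the extrinsic rotation is off by 1 degree of yaw.
perturbed = project_point(
    point, yaw_rotation(math.radians(1.0)), NO_SHIFT, FX, FY, CX, CY
)

shift_px = perturbed[0] - ideal[0]
print(f"horizontal pixel shift: {shift_px:.2f}")
```

Under these assumed intrinsics, the horizontal shift equals the focal length times the tangent of the yaw error, so a 1-degree error moves the projection by roughly 17 pixels. An offset of that size can pair a LiDAR point with image features from a neighbouring object, which is one way to picture the performance degradation under weak calibration that the abstract describes.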

Taxonomy

Topics: Advanced Neural Network Applications · Robotics and Sensor-Based Localization · Industrial Vision Systems and Defect Detection

Methods: Knowledge Distillation