# Deep Multi-modal Object Detection and Semantic Segmentation for Autonomous Driving: Datasets, Methods, and Challenges

**Authors:** Di Feng, Christian Haase-Schütz, Lars Rosenbaum, Heinz Hertlein, Claudius Glaeser, Fabian Timm, Werner Wiesbeck, Klaus Dietmayer

arXiv: 1902.07830 · 2020-07-20

## TL;DR

This paper reviews deep multi-modal perception techniques for autonomous driving, discussing sensor fusion methods, datasets, challenges, and open questions to guide future research in robust scene understanding.

## Contribution

The paper systematically summarizes existing methodologies for deep multi-modal object detection and semantic segmentation in autonomous driving, discusses key challenges and open questions, and gives an overview of on-board sensors and open datasets.

## Key findings

- Summarizes various sensor fusion strategies and their applications.
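- Highlights open challenges in multi-modal perception, including the lack of a general guideline for network architecture design and the open questions of "what to fuse", "when to fuse", and "how to fuse".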
- Provides a curated dataset overview and an interactive reference platform.

## Abstract

Recent advancements in perception for autonomous driving are driven by deep learning. In order to achieve robust and accurate scene understanding, autonomous vehicles are usually equipped with different sensors (e.g. cameras, LiDARs, Radars), and multiple sensing modalities can be fused to exploit their complementary properties. In this context, many methods have been proposed for deep multi-modal perception problems. However, there is no general guideline for network architecture design, and questions of "what to fuse", "when to fuse", and "how to fuse" remain open. This review paper attempts to systematically summarize methodologies and discuss challenges for deep multi-modal object detection and semantic segmentation in autonomous driving. To this end, we first provide an overview of on-board sensors on test vehicles, open datasets, and background information for object detection and semantic segmentation in autonomous driving research. We then summarize the fusion methodologies and discuss challenges and open questions. In the appendix, we provide tables that summarize topics and methods. We also provide an interactive online platform to navigate each reference: https://boschresearch.github.io/multimodalperception/.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1902.07830/full.md

## Figures

16 figures with captions in the complete paper: https://tomesphere.com/paper/1902.07830/full.md

## References

253 references — full list in the complete paper: https://tomesphere.com/paper/1902.07830/full.md

---
Source: https://tomesphere.com/paper/1902.07830