A Generalized Multi-Modal Fusion Detection Framework
Leichao Cui, Xiuxian Li, Min Meng, and Xiaoyu Mo

TL;DR
This paper introduces MMFusion, a flexible multi-modal 3D detection framework that fuses LiDAR and image data to improve detection accuracy in autonomous driving, especially for pedestrians and cyclists.
Contribution
The paper presents a novel, generic multi-modal fusion framework with a local perception module and a feature fusion module, compatible with various single-modal networks.
Findings
Outperforms existing methods on the KITTI benchmark for pedestrians and cyclists
Enhances detection accuracy through effective multi-modal feature fusion
Demonstrates strong robustness and generalization in complex scenes
Abstract
LiDAR point clouds have become the most common data source in autonomous driving. However, because point clouds are sparse, accurate and reliable detection cannot be achieved in some scenarios. Images complement point clouds and are therefore receiving increasing attention. Despite some success, existing fusion methods either perform hard fusion or do not fuse in a direct manner. In this paper, we propose a generic 3D detection framework called MMFusion, using multi-modal features. The framework aims to achieve accurate fusion between LiDAR and images to improve 3D detection in complex scenes. Our framework consists of two separate streams, the LiDAR stream and the camera stream, each of which is compatible with any single-modal feature extraction network. The Voxel Local Perception Module in the LiDAR stream enhances local feature representation, and then the…
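The two-stream idea in the abstract can be illustrated with a generic point-to-pixel fusion sketch. This is not MMFusion's fusion module, which the truncated abstract does not describe; it is a common baseline in which each LiDAR point is projected into the image and the image feature at that pixel is appended to the point feature. The function names, the 3 x 4 projection matrix `proj`, and the feature shapes are all assumptions made for illustration.

```python
import numpy as np


def project_points(points_xyz, proj):
    """Project N x 3 points (camera frame assumed) to pixels with a 3 x 4 matrix."""
    ones = np.ones((points_xyz.shape[0], 1))
    homo = np.hstack([points_xyz, ones])           # N x 4 homogeneous coordinates
    uvw = homo @ proj.T                            # N x 3
    depth = uvw[:, 2]
    # Use a placeholder depth of 1 for points behind the camera; they are
    # masked out later, this only avoids dividing by zero or a negative value.
    safe_depth = np.where(depth > 0, depth, 1.0)
    uv = uvw[:, :2] / safe_depth[:, None]
    return uv, depth


def fuse_by_concatenation(point_feats, image_feats, points_xyz, proj):
    """Append to each point feature the image feature at its projected pixel.

    point_feats: N x C_pts array from a hypothetical LiDAR stream.
    image_feats: H x W x C_img feature map from a hypothetical camera stream.
    Points that fall outside the image or behind the camera receive zeros.
    """
    h, w, c_img = image_feats.shape
    uv, depth = project_points(points_xyz, proj)
    cols = np.floor(uv[:, 0]).astype(int)
    rows = np.floor(uv[:, 1]).astype(int)
    valid = (depth > 0) & (cols >= 0) & (cols < w) & (rows >= 0) & (rows < h)

    sampled = np.zeros((points_xyz.shape[0], c_img), dtype=image_feats.dtype)
    sampled[valid] = image_feats[rows[valid], cols[valid]]
    fused = np.concatenate([point_feats, sampled], axis=1)
    return fused, valid


# Toy usage: identity-like projection, a 3 x 4 feature map with 2 channels.
proj = np.array([[1.0, 0.0, 0.0, 0.0],
                 [0.0, 1.0, 0.0, 0.0],
                 [0.0, 0.0, 1.0, 0.0]])
image_feats = np.arange(24, dtype=float).reshape(3, 4, 2)
points_xyz = np.array([[2.0, 1.0, 1.0],    # lands on pixel (row 1, col 2)
                       [0.0, 0.0, -1.0],   # behind the camera
                       [10.0, 0.0, 1.0]])  # outside the image
point_feats = np.ones((3, 1))
fused, valid = fuse_by_concatenation(point_feats, image_feats, points_xyz, proj)
```

Because the two streams only meet at the concatenation step, either feature extractor can be swapped without touching the other, which is the kind of backbone independence the abstract describes.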
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Optical Sensing Technologies · Autonomous Vehicle Technology and Safety
