MMFN: Multi-Modal-Fusion-Net for End-to-End Driving
Qingwen Zhang, Mingkai Tang, Ruoyu Geng, Feiyi Chen, Ren Xin, Lujia Wang

TL;DR
This paper introduces MMFN, a multi-modal fusion network that effectively integrates sensor data and HD map features for improved end-to-end autonomous driving performance.
Contribution
The paper proposes a novel method for extracting and utilizing vectorized HD map features and incorporates a new expert module considering multi-road rules, enhancing driving accuracy.
Findings
MMFN achieves better driving performance than existing methods.
It extracts useful features from vectorized HD maps efficiently, so that redundant map information does not mislead the agent.
A new expert that accounts for multiple road rules further improves model performance.
Abstract
Inspired by the fact that humans use diverse sensory organs to perceive the world, sensors with different modalities are deployed in end-to-end driving to obtain the global context of the 3D scene. In previous works, camera and LiDAR inputs are fused through transformers for better driving performance. These inputs are normally further interpreted as high-level map information to assist navigation tasks. Nevertheless, extracting useful information from the complex map input is challenging, for redundant information may mislead the agent and negatively affect driving performance. We propose a novel approach to efficiently extract features from vectorized High-Definition (HD) maps and utilize them in the end-to-end driving tasks. In addition, we design a new expert to further enhance the model performance by considering multi-road rules. Experimental results prove that both of the…
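To make the phrase "vectorized HD map" concrete, here is a minimal sketch of one common way to turn lane polylines into per-segment vectors. It is an illustration only, not the paper's implementation: the function names (`polyline_to_vectors`, `vectorize_map`, `within_radius`), the five-field vector layout, and the radius filter used to drop distant map elements are all assumptions made for this example.

```python
from typing import List, Tuple

Point = Tuple[float, float]
# Hypothetical vector layout: (x_start, y_start, x_end, y_end, polyline_id)
Vector = Tuple[float, float, float, float, int]


def polyline_to_vectors(points: List[Point], polyline_id: int) -> List[Vector]:
    """Turn each pair of consecutive points into one directed vector."""
    return [
        (x0, y0, x1, y1, polyline_id)
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    ]


def vectorize_map(polylines: List[List[Point]]) -> List[Vector]:
    """Flatten all polylines into a single list of vectors tagged by polyline id."""
    vectors: List[Vector] = []
    for pid, pts in enumerate(polylines):
        vectors.extend(polyline_to_vectors(pts, pid))
    return vectors


def within_radius(vectors: List[Vector], ego: Point, radius: float) -> List[Vector]:
    """Keep vectors whose midpoint lies within `radius` of the ego position.

    This is an assumed, simplistic stand-in for discarding redundant map data.
    """
    ex, ey = ego
    kept: List[Vector] = []
    for x0, y0, x1, y1, pid in vectors:
        mx, my = (x0 + x1) / 2, (y0 + y1) / 2
        if (mx - ex) ** 2 + (my - ey) ** 2 <= radius ** 2:
            kept.append((x0, y0, x1, y1, pid))
    return kept


# Two made-up lane centerlines in metres (not real map data).
lanes = [
    [(0.0, 0.0), (5.0, 0.0), (10.0, 0.0)],
    [(0.0, 3.5), (50.0, 3.5), (100.0, 3.5)],
]
vectors = vectorize_map(lanes)
nearby = within_radius(vectors, ego=(0.0, 0.0), radius=30.0)
```

In a learned pipeline, vectors like these would typically be fed to an encoder that produces a map feature for fusion with camera and LiDAR features; that part is omitted here because the summary above does not describe it.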
Taxonomy
Topics: Video Surveillance and Tracking Methods · Advanced Neural Network Applications · Robotics and Sensor-Based Localization
