World knowledge-enhanced Reasoning Using Instruction-guided Interactor in Autonomous Driving
Mingliang Zhai, Cheng Li, Zengyuan Guo, Ningrui Yang, Xiameng Qin, Sanyuan Zhao, Junyu Han, Ji Tao, Yuwei Wu, Yunde Jia

TL;DR
This paper introduces a novel framework for autonomous driving that enhances reasoning in perception-limited scenarios by integrating world knowledge through a plug-and-play instruction-guided interaction module and large-scale multi-modal datasets.
Contribution
It proposes a new interaction module for better perception and world knowledge integration, along with large-scale datasets for training and evaluating reasoning in autonomous driving.
Findings
Improved reasoning performance in perception-limited conditions.
Effective reduction of input sequence length for multi-view video processing.
Validated through extensive experiments demonstrating enhanced autonomous driving reasoning.
Abstract
Multi-modal Large Language Models (MLLMs) with extensive world knowledge have revitalized autonomous driving, particularly in reasoning tasks within perceivable regions. However, when faced with perception-limited areas (dynamic or static occlusion regions), MLLMs struggle to effectively integrate perception ability with world knowledge for reasoning. These perception-limited regions can conceal crucial safety information, especially for vulnerable road users. In this paper, we propose a framework that aims to improve autonomous driving performance under perception-limited conditions by enhancing the integration of perception capabilities and world knowledge. Specifically, we propose a plug-and-play instruction-guided interaction module that bridges modality gaps and significantly reduces the input sequence length, allowing it to adapt effectively to multi-view video inputs.…
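To make the sequence-length claim concrete, the following is a minimal sketch of one common way such a reduction can work: a small, fixed set of instruction-conditioned queries cross-attends to all multi-view video tokens, so the language model receives only as many tokens as there are queries. This is an assumption for illustration, not the paper's actual module, whose internals are not described in this excerpt. The function name `compress_tokens`, the query count of 32, the random stand-in for learned queries, and the camera/frame/patch counts are all hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def compress_tokens(visual_tokens, instruction_tokens, num_queries=32, seed=0):
    """Hypothetical instruction-conditioned cross-attention pooling.

    visual_tokens:      (N, d) flattened multi-view video tokens
    instruction_tokens: (M, d) embedded instruction tokens
    returns:            (num_queries, d) compressed tokens
    """
    rng = np.random.default_rng(seed)
    d = visual_tokens.shape[-1]
    # Random stand-in for queries that would be learned in a real model.
    queries = rng.standard_normal((num_queries, d)) * 0.02
    # Assumed conditioning: shift every query by the mean instruction embedding.
    queries = queries + instruction_tokens.mean(axis=0, keepdims=True)
    # Scaled dot-product attention from queries to visual tokens.
    scores = queries @ visual_tokens.T / np.sqrt(d)
    weights = softmax(scores, axis=-1)
    return weights @ visual_tokens


# Illustrative sizes only: 6 cameras, 4 frames, 196 patches per frame.
views, frames, patches, dim = 6, 4, 196, 64
visual = np.random.default_rng(1).standard_normal((views * frames * patches, dim))
instruction = np.random.default_rng(2).standard_normal((12, dim))

compressed = compress_tokens(visual, instruction)
print(visual.shape[0], "->", compressed.shape[0])
```

With these illustrative sizes, the token count passed downstream depends only on `num_queries`, not on the number of views or frames, which is the property that makes this kind of design attractive for multi-view video input.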
Taxonomy
Topics: Robotics and Automated Systems · Robotic Path Planning Algorithms · Intelligent Tutoring Systems and Adaptive Learning
