WiseAD: Knowledge Augmented End-to-End Autonomous Driving with Vision-Language Model
Songyan Zhang, Wenhui Huang, Zihui Gao, Hao Chen, Chen Lv

TL;DR
WiseAD is a vision-language model designed for autonomous driving that integrates extensive driving knowledge to improve safety, decision-making, and performance in diverse scenarios, achieving state-of-the-art results.
Contribution
The paper introduces WiseAD, a novel VLM for end-to-end autonomous driving that effectively incorporates and leverages broad driving knowledge for improved performance.
Findings
Significant reduction in critical accidents with increased knowledge diversity.
11.9% improvement in driving score on CARLA closed-loop evaluations.
Demonstrates strong knowledge understanding in both in-domain and out-of-domain datasets.
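
For readers unfamiliar with the driving score metric, the sketch below shows one common way it is computed in CARLA-based benchmarks: route completion multiplied by a penalty for each infraction. This is an illustration under stated assumptions, not the paper's evaluation code. The penalty values follow the public CARLA Leaderboard description as commonly cited and should be checked against the benchmark actually used. The function names and sample numbers are hypothetical, and the summary does not say whether the 11.9% figure is a relative or an absolute gain.

```python
from math import prod

# Assumed penalty multipliers per infraction type, modeled on the public
# CARLA Leaderboard description. Verify against the benchmark in use.
PENALTIES = {
    "collision_pedestrian": 0.50,
    "collision_vehicle": 0.60,
    "collision_static": 0.65,
    "red_light": 0.70,
    "stop_sign": 0.80,
}


def driving_score(route_completion_pct, infractions):
    """Route completion (0-100) scaled by the product of infraction penalties.

    `infractions` maps an infraction name to how many times it occurred.
    """
    penalty = prod(PENALTIES[name] ** count for name, count in infractions.items())
    return route_completion_pct * penalty


def relative_improvement_pct(baseline, new):
    """Relative change from `baseline` to `new`, in percent."""
    return 100.0 * (new - baseline) / baseline


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only (not results from the paper).
    base = driving_score(80.0, {"red_light": 1})
    better = driving_score(85.0, {"red_light": 1})
    print(f"baseline={base:.2f} improved={better:.2f} "
          f"gain={relative_improvement_pct(base, better):.2f}%")
```

Because the score is a product, a single collision can outweigh a large gain in route completion, which is consistent with the finding that fewer critical accidents go together with a higher driving score.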
Abstract
The emergence of general human knowledge and impressive logical reasoning capacity in rapidly advancing vision-language models (VLMs) has driven increasing interest in applying VLMs to high-level autonomous driving tasks, such as scene understanding and decision-making. However, an in-depth study on the relationship between knowledge proficiency, especially essential driving expertise, and closed-loop autonomous driving performance requires further exploration. In this paper, we investigate the effects of the depth and breadth of fundamental driving knowledge on closed-loop trajectory planning and introduce WiseAD, a specialized VLM tailored for end-to-end autonomous driving capable of driving reasoning, action justification, object recognition, risk analysis, driving suggestions, and trajectory planning across diverse scenarios. We employ joint training on driving knowledge and…
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Autonomous Vehicle Technology and Safety
Methods: Entropy Regularization · Proximal Policy Optimization · CARLA: An Open Urban Driving Simulator
