Making Large Language Models Better Planners with Reasoning-Decision Alignment
Zhijian Huang, Tao Tang, Shaoxiang Chen, Sihao Lin, Zequn Jie, Lin Ma, Guangrun Wang, Xiaodan Liang

TL;DR
This paper introduces RDA-Driver, a multimodal LLM-based autonomous driving model that aligns reasoning and decision-making, improving interpretability and performance in complex traffic scenarios.
Contribution
It proposes a novel end-to-end decision-making framework with reasoning-decision alignment and redesigned CoTs for better scene understanding and planning.
Findings
Achieves state-of-the-art planning performance on nuScenes, with an L2 error of 0.80 m.
Demonstrates leading results on DriveLM-nuScenes, with an L2 error of 0.82 m.
Enhances explainability and decision accuracy in autonomous driving.
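The L2 error quoted in the findings measures how far a planned trajectory lies from the ground-truth trajectory. The following is a minimal sketch of that idea, not the paper's evaluation code: the function name `l2_error`, the 2D waypoint format, and the simple averaging over waypoints are all assumptions made for illustration. Benchmark protocols differ in how they average over time horizons, and this summary does not state which one is used.

```python
import math

def l2_error(pred, gt):
    """Mean Euclidean distance between predicted and ground-truth waypoints.

    `pred` and `gt` are equal-length lists of (x, y) positions in the same
    units (hypothetical format, assumed here for illustration).
    """
    if len(pred) != len(gt) or not pred:
        raise ValueError("trajectories must be non-empty and of equal length")
    # Per-waypoint displacement between the plan and the ground truth.
    dists = [math.hypot(px - gx, py - gy) for (px, py), (gx, gy) in zip(pred, gt)]
    return sum(dists) / len(dists)

# Toy trajectories with per-waypoint displacements of 0, 1, and 5.
pred = [(0.0, 1.0), (0.0, 2.0), (3.0, 4.0)]
gt = [(0.0, 1.0), (0.0, 3.0), (0.0, 0.0)]
error = l2_error(pred, gt)
```

A lower value means the planned waypoints stay closer to the reference trajectory; a perfect plan scores 0.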
Abstract
Data-driven approaches for autonomous driving (AD) have been widely adopted in the past decade but are confronted with dataset bias and a lack of interpretability. Inspired by the knowledge-driven nature of human driving, recent approaches explore the potential of large language models (LLMs) to improve understanding and decision-making in traffic scenarios. They find that the pretrain-finetune paradigm of LLMs on downstream data with the Chain-of-Thought (CoT) reasoning process can enhance explainability and scene understanding. However, this popular strategy suffers from a notorious problem: misalignment between the crafted CoTs and the consequent decision-making, which previous LLM-based AD methods have left unaddressed. To address this problem, we propose an end-to-end decision-making model based on a multimodality-augmented LLM, which simultaneously executes CoT reasoning…
Taxonomy
Topics: Natural Language Processing Techniques · Semantic Web and Ontologies
