Reason2Drive: Towards Interpretable and Chain-based Reasoning for Autonomous Driving
Ming Nie, Renyuan Peng, Chunwei Wang, Xinyue Cai, Jianhua Han, Hang Xu, Li Zhang

TL;DR
Reason2Drive introduces a large-scale dataset and evaluation framework to advance interpretable, chain-based reasoning in autonomous driving, enabling better understanding and improvement of vision-language models in complex driving scenarios.
Contribution
The paper presents a new benchmark dataset with over 600K video-text pairs and a novel evaluation metric for reasoning, facilitating research in interpretable autonomous driving systems.
Findings
Existing VLMs have limited reasoning capabilities in driving tasks.
The proposed approach improves reasoning accuracy by leveraging object-level perceptual features.
Benchmark results reveal strengths and weaknesses of current models in complex driving environments.
Abstract
Large vision-language models (VLMs) have garnered increasing interest in autonomous driving areas, due to their advanced capabilities in complex reasoning tasks essential for highly autonomous vehicle behavior. Despite their potential, research in autonomous systems is hindered by the lack of datasets with annotated reasoning chains that explain the decision-making processes in driving. To bridge this gap, we present Reason2Drive, a benchmark dataset with over 600K video-text pairs, aimed at facilitating the study of interpretable reasoning in complex driving environments. We distinctly characterize the autonomous driving process as a sequential combination of perception, prediction, and reasoning steps, and the question-answer pairs are automatically collected from a diverse range of open-source outdoor driving datasets, including nuScenes, Waymo and ONCE. Moreover, we introduce a…
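As a rough illustration of the perception, prediction, and reasoning decomposition described in the abstract, the sketch below shows one way a single video question-answer sample with an annotated reasoning chain might be represented. Every name here (`Stage`, `ReasoningStep`, `VideoQASample`, `is_ordered`, the field names, and the example text) is hypothetical and chosen for illustration only; it is not the dataset's actual schema or API.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Stage(Enum):
    """The three sequential stages named in the abstract (definition order matters)."""
    PERCEPTION = "perception"
    PREDICTION = "prediction"
    REASONING = "reasoning"


@dataclass
class ReasoningStep:
    """One link in a reasoning chain (hypothetical structure)."""
    stage: Stage
    text: str


@dataclass
class VideoQASample:
    """A hypothetical video-text sample; field names are assumptions."""
    source_dataset: str  # e.g. "nuScenes", "Waymo", or "ONCE"
    video_id: str        # placeholder identifier, not a real clip ID
    question: str
    answer: str
    chain: List[ReasoningStep] = field(default_factory=list)

    def is_ordered(self) -> bool:
        """Check that the chain never steps back to an earlier stage."""
        order = list(Stage)
        indices = [order.index(step.stage) for step in self.chain]
        return indices == sorted(indices)


# Invented example content, for illustration only.
sample = VideoQASample(
    source_dataset="nuScenes",
    video_id="clip-0001",
    question="Should the ego vehicle slow down?",
    answer="Yes, it should slow down.",
    chain=[
        ReasoningStep(Stage.PERCEPTION, "A pedestrian stands at the curb ahead."),
        ReasoningStep(Stage.PREDICTION, "The pedestrian is likely to cross."),
        ReasoningStep(Stage.REASONING, "Slowing down keeps a safe margin."),
    ],
)

print(sample.is_ordered())
```

Keeping the stage as an explicit field on each step makes it easy to filter or score steps by stage, which is one plausible way to inspect where a model's chain goes wrong.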
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications
