AD^2-Bench: A Hierarchical CoT Benchmark for MLLM in Autonomous Driving under Adverse Conditions
Zhaoyang Wei, Chenhui Qiang, Bowen Jiang, Xumeng Han, Xuehui Yu, Zhenjun Han

TL;DR
AD^2-Bench is a new hierarchical Chain-of-Thought (CoT) benchmark for evaluating how well Multi-Modal Large Models (MLLMs) reason about autonomous driving under adverse weather and in complex scenes. It highlights the limitations of current models.
Contribution
It introduces the first comprehensive CoT benchmark for autonomous driving in challenging conditions, with extensive annotations and an evaluation framework for fine-grained reasoning analysis.
Findings
State-of-the-art MLLMs achieve below 60% accuracy on AD^2-Bench.
The benchmark reveals significant challenges in current models' reasoning capabilities.
AD^2-Bench facilitates targeted improvements in autonomous driving AI systems.
Abstract
Chain-of-Thought (CoT) reasoning has emerged as a powerful approach to enhance the structured, multi-step decision-making capabilities of Multi-Modal Large Models (MLLMs), which is particularly crucial for autonomous driving under adverse weather conditions and in complex traffic environments. However, existing benchmarks have largely overlooked the need for rigorous evaluation of CoT processes in these specific and challenging scenarios. To address this critical gap, we introduce AD^2-Bench, the first Chain-of-Thought benchmark specifically designed for autonomous driving with adverse weather and complex scenes. AD^2-Bench is meticulously constructed to fulfill three key criteria: comprehensive data coverage across diverse adverse environments, fine-grained annotations that support multi-step reasoning, and a dedicated evaluation framework tailored for assessing CoT performance. The core…
Taxonomy
Topics: Autonomous Vehicle Technology and Safety · Multimodal Machine Learning Applications · Advanced Neural Network Applications
