RAC3: Retrieval-Augmented Corner Case Comprehension for Autonomous Driving with Vision-Language Models
Yujin Wang, Quanfeng Liu, Jiaqi Fan, Jinlong Hong, Hongqing Chu, Mengjian Tian, Bingzhao Gao, Hong Chen

TL;DR
RAC3 is a framework that improves how vision-language models comprehend corner cases in autonomous driving by combining retrieval, multimodal reasoning, and continual learning, with the aim of safer behavior in critical scenarios.
Contribution
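The paper introduces RAC3, which combines a frequency-spatial fusion (FSF) image encoder, cross-modal alignment training with hard and semi-hard negative mining, a retrieval pipeline built on K-Means clustering and HNSW indexing, and multimodal chain-of-thought prompting to improve corner case understanding in autonomous driving.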
Findings
Achieves the highest score, 74.46, on the CODA-LM benchmark.
Demonstrates significant performance gains over prior methods.
Enhances safety and interpretability in autonomous driving systems.
Abstract
Understanding and addressing corner cases is essential to the safety and reliability of autonomous driving systems. Vision-language models (VLMs) play a crucial role in scenario comprehension, yet they suffer from hallucination and insufficient real-world grounding, which compromise their performance in critical driving scenarios. This work proposes RAC3, a framework designed to improve VLM performance in corner case comprehension. RAC3 integrates a frequency-spatial fusion (FSF) image encoder, a cross-modal alignment training method for embedding models with hard and semi-hard negative mining, and a fast querying and retrieval pipeline based on K-Means clustering and hierarchical navigable small world (HNSW) indexing. A multimodal chain-of-thought (CoT) prompting strategy to guide analogical reasoning and reduce…
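The abstract mentions hard and semi-hard negative mining for training the embedding model but, as excerpted here, does not define the two categories. The following is a minimal sketch that assumes the convention common in triplet-loss training: a negative is "hard" when it is at least as similar to the anchor as the true positive, and "semi-hard" when it is less similar than the positive but within a margin of it. The function names, the use of cosine similarity, and the `margin` value are illustrative assumptions, not details taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mine_negatives(anchor, positive, candidates, margin=0.3):
    """Split candidate negatives into hard and semi-hard index lists.

    Assumed convention (not confirmed by the paper):
      hard:      sim(anchor, neg) >= sim(anchor, pos)
      semi-hard: sim(anchor, pos) - margin < sim(anchor, neg) < sim(anchor, pos)
    Candidates below the margin band are treated as easy and skipped.
    """
    sim_pos = cosine(anchor, positive)
    hard, semi_hard = [], []
    for idx, neg in enumerate(candidates):
        sim_neg = cosine(anchor, neg)
        if sim_neg >= sim_pos:
            hard.append(idx)
        elif sim_neg > sim_pos - margin:
            semi_hard.append(idx)
    return hard, semi_hard


# Toy 2-D embeddings: the anchor could stand for an image embedding and the
# positive for its matching text embedding (hypothetical values).
anchor = [1.0, 0.0]
positive = [0.8, 0.6]
candidates = [[0.6, 0.8], [0.0, 1.0], [1.0, 0.1], [-1.0, 0.0]]
hard, semi_hard = mine_negatives(anchor, positive, candidates)
print("hard:", hard, "semi-hard:", semi_hard)
```

In a cross-modal setting, the mined indices would typically select which mismatched image-text pairs contribute to the contrastive loss in a given batch; how RAC3 weights or samples them is not stated in the excerpt.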
Taxonomy
Topics: Robotics and Automated Systems · Constraint Satisfaction and Optimization · Advanced Neural Network Applications
Methods: Contrastive Learning
