SafeDriveRAG: Towards Safe Autonomous Driving with Knowledge Graph-based Retrieval-Augmented Generation
Hao Ye, Mengshi Qi, Zhaohong Liu, Liang Liu, Huadong Ma

TL;DR
This paper introduces SafeDrive228K, a large-scale benchmark for evaluating vision-language models in traffic safety scenarios, and proposes a knowledge graph-based retrieval-augmented generation method to improve model reliability in autonomous driving tasks.
Contribution
The paper presents the first large-scale multimodal benchmark for traffic safety questions and a novel retrieval-augmented generation approach using knowledge graphs for autonomous driving safety.
Findings
Retrieval-augmented models outperform baseline VLMs on safety tasks.
Performance improves significantly on the traffic accident and corner case tasks.
Safety reasoning improves across five mainstream VLMs.
Abstract
In this work, we study how vision-language models (VLMs) can be used to enhance the safety of autonomous driving systems, including perception, situational understanding, and path planning. However, existing research has largely overlooked the evaluation of these models in safety-critical driving scenarios. To bridge this gap, we create a benchmark (SafeDrive228K) and propose a new baseline, a VLM with knowledge graph-based retrieval-augmented generation (SafeDriveRAG), for visual question answering (VQA). Specifically, we introduce SafeDrive228K, the first large-scale multimodal question-answering benchmark, comprising 228K examples across 18 sub-tasks. This benchmark encompasses a diverse range of traffic safety queries, from traffic accidents and corner cases to common safety knowledge, enabling a thorough assessment of the comprehension and reasoning abilities…
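The abstract names knowledge graph-based retrieval-augmented generation but, as excerpted here, does not spell out the mechanism. As a rough illustration of the general idea only, and not the authors' SafeDriveRAG pipeline, the sketch below matches entities in a question against a toy graph, follows a couple of hops, and prepends the retrieved facts to the prompt. The triples and the names `build_graph`, `retrieve`, and `build_prompt` are all hypothetical; a real system would use a learned retriever and pass the prompt to a VLM together with the image or video.

```python
from collections import defaultdict

# Hypothetical toy knowledge graph as (head, relation, tail) triples.
# These facts are illustrative and are not taken from SafeDrive228K.
TRIPLES = [
    ("wet road", "increases", "braking distance"),
    ("braking distance", "requires", "larger following gap"),
    ("pedestrian", "has priority at", "crosswalk"),
]


def build_graph(triples):
    """Index triples by head entity for fast neighbor lookup."""
    graph = defaultdict(list)
    for head, relation, tail in triples:
        graph[head].append((relation, tail))
    return graph


def retrieve(graph, question, hops=2):
    """Collect facts reachable within `hops` steps of entities named in the question.

    Entity matching here is naive substring matching (an assumption made
    for brevity); it only finds entities that appear as a head in the graph.
    """
    q = question.lower()
    frontier = [entity for entity in graph if entity in q]
    seen = set(frontier)
    facts = []
    for _ in range(hops):
        next_frontier = []
        for head in frontier:
            for relation, tail in graph.get(head, []):
                facts.append(f"{head} {relation} {tail}")
                if tail not in seen:
                    seen.add(tail)
                    next_frontier.append(tail)
        frontier = next_frontier
    return facts


def build_prompt(question, facts):
    """Prepend retrieved facts to the question as textual context."""
    context = "\n".join(f"- {fact}" for fact in facts)
    return f"Known facts:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    graph = build_graph(TRIPLES)
    question = "How should I drive on a wet road?"
    print(build_prompt(question, retrieve(graph, question)))
```

The multi-hop expansion is the part a flat document retriever would miss: the question mentions only "wet road", yet the second hop surfaces the following-gap fact linked through "braking distance".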
