SafeDriveRAG: Towards Safe Autonomous Driving with Knowledge Graph-based Retrieval-Augmented Generation

Hao Ye; Mengshi Qi; Zhaohong Liu; Liang Liu; Huadong Ma

arXiv:2507.21585·cs.AI·July 30, 2025

TL;DR

This paper introduces SafeDrive228K, a large-scale benchmark for evaluating vision-language models in traffic safety scenarios, and proposes a knowledge graph-based retrieval-augmented generation method to improve model reliability in autonomous driving tasks.

Contribution

The paper presents the first large-scale multimodal benchmark for traffic safety questions and a novel retrieval-augmented generation approach using knowledge graphs for autonomous driving safety.

Findings

01 Retrieval-augmented models outperform baseline VLMs in safety tasks.

02 Significant performance improvements in traffic accidents and corner cases.

03 Enhanced safety reasoning capabilities demonstrated across five mainstream VLMs.

Abstract

In this work, we study how vision-language models (VLMs) can be used to enhance the safety of autonomous driving systems, including perception, situational understanding, and path planning. However, existing research has largely overlooked the evaluation of these models in safety-critical driving scenarios. To bridge this gap, we create a benchmark (SafeDrive228K) and propose a new baseline based on VLMs with knowledge graph-based retrieval-augmented generation (SafeDriveRAG) for visual question answering (VQA). Specifically, we introduce SafeDrive228K, the first large-scale multimodal question-answering benchmark, comprising 228K examples across 18 sub-tasks. This benchmark encompasses a diverse range of traffic safety queries, from traffic accidents and corner cases to common safety knowledge, enabling a thorough assessment of the comprehension and reasoning abilities…
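As a rough illustration of the general idea behind knowledge graph-based retrieval-augmented generation, the sketch below retrieves facts from a toy graph and prepends them to a question. It is not the SafeDriveRAG implementation, whose details are not given on this page. The triples, the function names (`build_index`, `retrieve`, `build_prompt`), the substring-based entity matching, and the two-hop expansion are all assumptions made for the example, and the image input and the VLM call are omitted.

```python
# Hypothetical toy knowledge graph of (head, relation, tail) triples.
# These facts are illustrative only and are not taken from SafeDrive228K.
TRIPLES = [
    ("wet road", "increases", "braking distance"),
    ("braking distance", "requires", "larger following gap"),
    ("pedestrian crossing", "requires", "yielding to pedestrians"),
    ("fog", "reduces", "visibility"),
]


def build_index(triples):
    """Group triples by head entity so they can be looked up directly."""
    index = {}
    for head, rel, tail in triples:
        index.setdefault(head, []).append((head, rel, tail))
    return index


def retrieve(question, index, hops=2):
    """Return triples reachable within `hops` steps from entities named in the question.

    Entity linking here is a plain substring match (an assumption for the sketch).
    """
    q = question.lower()
    frontier = [entity for entity in index if entity in q]
    seen, facts = set(), []
    for _ in range(hops):
        next_frontier = []
        for entity in frontier:
            if entity in seen:
                continue
            seen.add(entity)
            for triple in index.get(entity, []):
                facts.append(triple)
                next_frontier.append(triple[2])  # follow the tail entity
        frontier = next_frontier
    return facts


def build_prompt(question, facts):
    """Prepend the retrieved facts to the question as plain-text context."""
    lines = [f"- {h} {r} {t}" for h, r, t in facts]
    context = "\n".join(lines) if lines else "- (no facts retrieved)"
    return f"Known traffic safety facts:\n{context}\n\nQuestion: {question}\nAnswer:"


index = build_index(TRIPLES)
question = "How should I drive on a wet road?"
facts = retrieve(question, index)
prompt = build_prompt(question, facts)
print(prompt)
```

In a full pipeline the resulting prompt would be passed to a VLM together with the driving image. Expanding over graph edges lets the retriever surface related facts (here, the following-gap rule) that a match on the question text alone would miss.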
