From Words to Collisions: LLM-Guided Evaluation and Adversarial Generation of Safety-Critical Driving Scenarios

Yuan Gao; Mattia Piccinini; Korbinian Moller; Amr Alanwar; Johannes Betz

arXiv:2502.02145 · cs.AI · July 21, 2025

TL;DR

The paper combines Large Language Models (LLMs) with structured prompts to evaluate and generate safety-critical driving scenarios automatically, reducing reliance on manual scenario design and making autonomous-vehicle testing more scalable.

Contribution

The authors introduce LLM-guided evaluation and adversarial scenario generation for autonomous vehicle safety testing: two prompt strategies (Cartesian and Ego-centric) for scenario evaluation, and an adversarial module that modifies the trajectories of risk-inducing vehicles to create critical scenarios.

Findings

01 Evaluation module accurately detects collision scenarios.

02 Generation module synthesizes realistic, high-risk scenarios.

03 Approach reduces dependence on handcrafted safety metrics.

Abstract

Ensuring the safety of autonomous vehicles requires virtual scenario-based testing, which depends on the robust evaluation and generation of safety-critical scenarios. So far, researchers have used scenario-based testing frameworks that rely heavily on handcrafted scenarios as safety metrics. To reduce the effort of human interpretation and overcome the limited scalability of these approaches, we combine Large Language Models (LLMs) with structured scenario parsing and prompt engineering to automatically evaluate and generate safety-critical driving scenarios. We introduce Cartesian and Ego-centric prompt strategies for scenario evaluation, and an adversarial generation module that modifies trajectories of risk-inducing vehicles (ego-attackers) to create critical scenarios. We validate our approach using a 2D simulation framework and multiple pre-trained LLMs. The results show that the…
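The difference between a Cartesian and an ego-centric description of the same scene can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' prompt format: the `VehicleState` fields, the function names, and the wording of the generated text are all hypothetical. The sketch uses the standard 2D rotation into the ego vehicle's frame (x forward, y to the left).

```python
import math
from dataclasses import dataclass


@dataclass
class VehicleState:
    """Hypothetical vehicle state in a global 2D frame (not from the paper)."""
    x: float        # global x position in metres
    y: float        # global y position in metres
    heading: float  # radians, counter-clockwise from the global x-axis
    speed: float    # metres per second


def to_ego_frame(ego: VehicleState, other: VehicleState) -> tuple[float, float]:
    """Return (longitudinal, lateral) offset of `other` in the ego frame.

    Longitudinal is positive ahead of the ego vehicle; lateral is positive
    to its left.
    """
    dx = other.x - ego.x
    dy = other.y - ego.y
    cos_h = math.cos(ego.heading)
    sin_h = math.sin(ego.heading)
    # Rotate the global offset by -heading to express it in the ego frame.
    longitudinal = cos_h * dx + sin_h * dy
    lateral = -sin_h * dx + cos_h * dy
    return longitudinal, lateral


def describe_cartesian(name: str, state: VehicleState) -> str:
    """Describe a vehicle by its global coordinates (assumed wording)."""
    return (
        f"{name}: position=({state.x:.1f}, {state.y:.1f}) m, "
        f"heading={math.degrees(state.heading):.0f} deg, "
        f"speed={state.speed:.1f} m/s"
    )


def describe_ego_centric(name: str, ego: VehicleState, other: VehicleState) -> str:
    """Describe a vehicle relative to the ego vehicle (assumed wording)."""
    lon, lat = to_ego_frame(ego, other)
    ahead = "ahead" if lon >= 0 else "behind"
    if abs(lat) < 1e-6:
        return f"{name}: {abs(lon):.1f} m {ahead}, in line with ego"
    side = "left" if lat > 0 else "right"
    return f"{name}: {abs(lon):.1f} m {ahead}, {abs(lat):.1f} m {side} of ego"


# Ego vehicle heading along the global +y axis; another car ahead and to its left.
ego = VehicleState(x=10.0, y=5.0, heading=math.pi / 2, speed=8.0)
car = VehicleState(x=8.0, y=15.0, heading=math.pi / 2, speed=6.0)

print(describe_cartesian("car_1", car))
print(describe_ego_centric("car_1", ego, car))
```

The Cartesian line leaves the model to work out the relative geometry from raw coordinates, while the ego-centric line states that geometry directly.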

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques