Reality Bites: Assessing the Realism of Driving Scenarios with Large Language Models

Jiahui Wu; Chengjie Lu; Aitor Arrieta; Tao Yue; Shaukat Ali

arXiv:2403.09906·cs.SE·March 18, 2024·1 cites

TL;DR

This paper evaluates whether large language models can assess the realism of driving scenarios, a capability relevant to autonomous driving testing, and finds that GPT is more robust than the other evaluated models across various conditions.

Contribution

It introduces an empirical evaluation framework for assessing LLMs' effectiveness in judging driving scenario realism, highlighting GPT's superior robustness.

Findings

01. GPT achieved the highest robustness across scenarios
02. Weather and road conditions affect LLM performance
03. Mistral performed the worst in the assessments

Abstract

Large Language Models (LLMs) are demonstrating outstanding potential for tasks such as text generation, summarization, and classification. Given that such models are trained on a humongous amount of online knowledge, we hypothesize that LLMs can assess whether driving scenarios generated by autonomous driving testing techniques are realistic, i.e., being aligned with real-world driving conditions. To test this hypothesis, we conducted an empirical evaluation to assess whether LLMs are effective and robust in performing the task. This reality check is an important step towards devising LLM-based autonomous driving testing techniques. For our empirical evaluation, we selected 64 realistic scenarios from DeepScenario, an open driving scenario dataset. Next, by introducing minor changes to them, we created 512 additional realistic scenarios, to form an overall dataset of 576 scenarios.…

Code & Models

Repositories

simula-complex/realitybites (official)

Taxonomy

Topics: Topic Modeling · Computational and Text Analysis Methods