Large Language Models in Code Co-generation for Safe Autonomous Vehicles

Ali Nouri, Beatriz Cabrero-Daniel, Zhennan Fei, Krishna Ronanki, Håkan Sivencrona, and Christian Berger

arXiv:2505.19658 · cs.SE · May 27, 2025

Open Access

TL;DR

This paper evaluates the safety and reliability of large language models in generating code for autonomous vehicle systems, proposing an assessment pipeline to aid safety reviews and analyzing common faults.

Contribution

It introduces a systematic evaluation pipeline for LLM-generated code in safety-critical automotive applications and provides a detailed fault analysis to improve review processes.

Findings

01. Six LLMs tested on safety-related tasks

02. Identified common fault types in LLM-generated code

03. Discussion on LLM limitations and capabilities

Abstract

Software engineers in various industrial domains are already using Large Language Models (LLMs) to accelerate the process of implementing parts of software systems. When considering their potential use for ADAS or AD systems in the automotive context, there is a need to systematically assess this new setup: LLMs entail a well-documented set of risks for safety-related systems' development due to their stochastic nature. To reduce the effort for code reviewers to evaluate LLM-generated code, we propose an evaluation pipeline to conduct sanity checks on the generated code. We compare the performance of six state-of-the-art LLMs (CodeLlama, CodeGemma, DeepSeek-r1, DeepSeek-Coders, Mistral, and GPT-4) on four safety-related programming tasks. Additionally, we qualitatively analyse the most frequent faults generated by these LLMs, creating a failure-mode catalogue to support human reviewers.…
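The idea of running sanity checks on generated code before a human review can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the stages (syntax check, entry-point check, test cases), the names `sanity_check` and `CheckReport`, and the toy `clamp_speed` task are all hypothetical and do not come from the paper.

```python
import ast
from dataclasses import dataclass, field
from typing import Any


@dataclass
class CheckReport:
    """Summary handed to a human reviewer (hypothetical structure)."""
    parses: bool = False
    defines_entry_point: bool = False
    passed: int = 0
    failed: int = 0
    notes: list[str] = field(default_factory=list)


def sanity_check(source: str, entry_point: str,
                 cases: list[tuple[tuple, Any]]) -> CheckReport:
    """Run three cheap checks on a generated Python snippet.

    Assumed stages: (1) the code parses, (2) it defines the expected
    top-level function, (3) that function passes simple test cases.
    """
    report = CheckReport()

    # Stage 1: reject code that is not syntactically valid.
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        report.notes.append(f"syntax error: {exc.msg}")
        return report
    report.parses = True

    # Stage 2: the requested function must exist at module level.
    names = {n.name for n in tree.body if isinstance(n, ast.FunctionDef)}
    if entry_point not in names:
        report.notes.append(f"missing function: {entry_point}")
        return report
    report.defines_entry_point = True

    # Stage 3: execute the snippet and run the test cases.
    # NOTE: exec() is not sandboxed; a real setup would isolate this step.
    namespace: dict[str, Any] = {}
    try:
        exec(compile(tree, "<generated>", "exec"), namespace)
    except Exception as exc:
        report.notes.append(f"import-time error: {type(exc).__name__}")
        return report

    func = namespace[entry_point]
    for args, expected in cases:
        try:
            ok = func(*args) == expected
        except Exception as exc:
            ok = False
            report.notes.append(f"{args!r} raised {type(exc).__name__}")
        if ok:
            report.passed += 1
        else:
            report.failed += 1
    return report


# Toy usage with a hypothetical generated snippet.
generated = "def clamp_speed(v, limit):\n    return min(v, limit)\n"
print(sanity_check(generated, "clamp_speed", [((50, 30), 30), ((20, 30), 20)]))
```

A report like this only filters out obviously broken candidates; it does not replace the safety review the abstract refers to.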

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Software Testing and Debugging Techniques

Methods: Sparse Evolutionary Training