Large Language Models in Code Co-generation for Safe Autonomous Vehicles
Ali Nouri, Beatriz Cabrero-Daniel, Zhennan Fei, Krishna Ronanki, Håkan Sivencrona, and Christian Berger

TL;DR
This paper evaluates the safety and reliability of large language models in generating code for autonomous vehicle systems, proposing an assessment pipeline to aid safety reviews and analyzing common faults.
Contribution
It introduces a systematic evaluation pipeline for LLM-generated code in safety-critical automotive applications and provides a detailed fault analysis to improve review processes.
Findings
Six LLMs compared on four safety-related programming tasks
Most frequent fault types in LLM-generated code collected into a failure-mode catalogue
Discussion of LLM limitations and capabilities
Abstract
Software engineers in various industrial domains are already using Large Language Models (LLMs) to accelerate the process of implementing parts of software systems. When considering their potential use for ADAS or AD systems in the automotive context, there is a need to systematically assess this new setup: LLMs entail a well-documented set of risks for safety-related systems' development due to their stochastic nature. To reduce the effort for code reviewers to evaluate LLM-generated code, we propose an evaluation pipeline to conduct sanity-checks on the generated code. We compare the performance of six state-of-the-art LLMs (CodeLlama, CodeGemma, DeepSeek-r1, DeepSeek-Coders, Mistral, and GPT-4) on four safety-related programming tasks. Additionally, we qualitatively analyse the most frequent faults generated by these LLMs, creating a failure-mode catalogue to support human reviewers.…
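The abstract does not spell out the stages of the proposed pipeline, so the following is only a minimal sketch of what staged sanity-checking of generated code could look like, not the authors' implementation. The names `CheckResult`, `sanity_check`, and the toy `should_brake` function are hypothetical, as is the choice of three stages (parse, load, run test cases).

```python
import ast
from dataclasses import dataclass
from typing import Any, Sequence


@dataclass
class CheckResult:
    stage: str        # stage that failed, or "passed" if all stages succeeded
    ok: bool
    detail: str = ""


def sanity_check(source: str, func_name: str,
                 cases: Sequence[tuple[tuple, Any]]) -> CheckResult:
    """Hypothetical three-stage check; stops at the first failing stage."""
    # Stage 1: does the generated text parse as Python at all?
    try:
        ast.parse(source)
    except SyntaxError as exc:
        return CheckResult("syntax", False, str(exc))

    # Stage 2: does it load and define the requested function?
    namespace: dict[str, Any] = {}
    try:
        # Assumption: trusted toy input. A real pipeline would run
        # untrusted generated code in a sandbox, not via exec().
        exec(source, namespace)
    except Exception as exc:
        return CheckResult("load", False, repr(exc))
    func = namespace.get(func_name)
    if not callable(func):
        return CheckResult("load", False, f"{func_name} is not defined")

    # Stage 3: compare outputs against (args, expected) pairs.
    for args, expected in cases:
        try:
            actual = func(*args)
        except Exception as exc:
            return CheckResult("tests", False, f"{args}: raised {exc!r}")
        if actual != expected:
            return CheckResult(
                "tests", False,
                f"{args}: got {actual!r}, expected {expected!r}")
    return CheckResult("passed", True)


# Toy stand-in for an LLM response (hypothetical task, not from the paper).
generated = (
    "def should_brake(distance_m, threshold_m):\n"
    "    return distance_m < threshold_m\n"
)
cases = [((5.0, 10.0), True), ((20.0, 10.0), False)]
print(sanity_check(generated, "should_brake", cases))
```

Reporting the first failing stage, rather than a single pass/fail flag, is one way such a check could tell a human reviewer where to start looking.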
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Software Testing and Debugging Techniques
Methods: Sparse Evolutionary Training
