Large Language Models in Code Co-generation for Safe Autonomous Vehicles
By: Ali Nouri, Beatriz Cabrero-Daniel, Zhennan Fei, and more
Potential Business Impact:
Automatically checks AI-generated car code for safety faults.
Software engineers in various industrial domains are already using Large Language Models (LLMs) to accelerate the implementation of parts of software systems. When considering their potential use for ADAS or AD systems in the automotive context, this new setup needs to be assessed systematically: due to their stochastic nature, LLMs entail a well-documented set of risks for the development of safety-related systems. To reduce the effort required of code reviewers evaluating LLM-generated code, we propose an evaluation pipeline that conducts sanity checks on the generated code. We compare the performance of six state-of-the-art LLMs (CodeLlama, CodeGemma, DeepSeek-r1, DeepSeek-Coder, Mistral, and GPT-4) on four safety-related programming tasks. Additionally, we qualitatively analyse the most frequent faults generated by these LLMs, creating a failure-mode catalogue to support human reviewers. Finally, we discuss the capabilities and limitations of LLMs for code generation, and how the proposed pipeline can be integrated into existing development processes.
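The abstract does not spell out the pipeline's internals, but the core idea, cheap automated sanity checks that gate human review of generated code, can be illustrated with a short sketch. The sketch below is an assumption, not the authors' implementation: names such as `check_syntax`, `check_unit_tests`, and `run_pipeline` are hypothetical, and a real pipeline for ADAS/AD code would add further gates (e.g. static analysis or coding-standard checks) beyond the two shown here.

```python
import ast
import subprocess
import tempfile
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class CheckResult:
    """Outcome of one sanity check on a generated code candidate."""
    name: str
    passed: bool
    detail: str = ""


@dataclass
class PipelineReport:
    results: list[CheckResult] = field(default_factory=list)

    @property
    def passed(self) -> bool:
        return all(r.passed for r in self.results)


def check_syntax(code: str) -> CheckResult:
    # Cheapest gate: reject candidates that do not even parse.
    try:
        ast.parse(code)
        return CheckResult("syntax", True)
    except SyntaxError as exc:
        return CheckResult("syntax", False, str(exc))


def check_unit_tests(code: str, test_code: str, timeout_s: int = 30) -> CheckResult:
    # Run task-specific tests against the candidate in a subprocess,
    # with a timeout so non-terminating code cannot stall the pipeline.
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "candidate.py"
        src.write_text(code + "\n\n" + test_code)
        try:
            proc = subprocess.run(
                ["python", str(src)],
                capture_output=True, text=True, timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            return CheckResult("unit_tests", False, "timeout")
        return CheckResult("unit_tests", proc.returncode == 0, proc.stderr.strip())


def run_pipeline(code: str, test_code: str) -> PipelineReport:
    # Run checks in order of increasing cost; stop at the first failing gate.
    report = PipelineReport()
    for check in (lambda: check_syntax(code),
                  lambda: check_unit_tests(code, test_code)):
        result = check()
        report.results.append(result)
        if not result.passed:
            break
    return report
```

Under this reading, a human reviewer only inspects candidates that pass every gate, and can cross-check any remaining faults against the paper's failure-mode catalogue rather than triaging raw LLM output.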
Similar Papers
Generating Automotive Code: Large Language Models for Software Development and Verification in Safety-Critical Systems
Software Engineering
Makes car software safer and faster to build.
On Simulation-Guided LLM-based Code Generation for Safe Autonomous Driving Software
Software Engineering
Makes self-driving car code faster and safer.
Large Language Models for Code Generation: A Comprehensive Survey of Challenges, Techniques, Evaluation, and Applications
Software Engineering
Surveys how LLMs let anyone write computer programs in plain English.