Automated Repair of AI Code with Large Language Models and Formal   Verification

Yiannis Charalambous; Edoardo Manino; Lucas C. Cordeiro

arXiv:2405.08848·cs.SE·May 16, 2024

Automated Repair of AI Code with Large Language Models and Formal Verification

Yiannis Charalambous, Edoardo Manino, Lucas C. Cordeiro

PDF

Open Access

TL;DR

This paper presents an automated approach combining large language models and formal verification to detect and repair memory safety vulnerabilities in neural network code, enhancing AI system safety.

Contribution

It introduces a large-scale neural network code dataset, integrates formal verification with language models for automated repair, and evaluates different prompt engineering techniques.

Findings

01

Expanded NeuroCodeBench to 81k programs for testing.

02

Successfully verified memory safety and repaired vulnerabilities automatically.

03

Compared effectiveness of various prompt engineering methods.

Abstract

The next generation of AI systems requires strong safety guarantees. This report looks at the software implementation of neural networks and related memory safety properties, including NULL pointer deference, out-of-bound access, double-free, and memory leaks. Our goal is to detect these vulnerabilities, and automatically repair them with the help of large language models. To this end, we first expand the size of NeuroCodeBench, an existing dataset of neural network code, to about 81k programs via an automated process of program mutation. Then, we verify the memory safety of the mutated neural network implementations with ESBMC, a state-of-the-art software verifier. Whenever ESBMC spots a vulnerability, we invoke a large language model to repair the source code. For the latest task, we compare the performance of various state-of-the-art prompt engineering techniques, and an iterative…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsScientific Computing and Data Management · Software Engineering Research · Machine Learning and Data Classification