All Artificial, Less Intelligence: GenAI through the Lens of Formal   Verification

Deepak Narayan Gadde; Aman Kumar; Thomas Nalapat; Evgenii Rezunov and; Fabio Cappellini

arXiv:2403.16750·cs.AI·September 4, 2024·1 cites

All Artificial, Less Intelligence: GenAI through the Lens of Formal Verification

Deepak Narayan Gadde, Aman Kumar, Thomas Nalapat, Evgenii Rezunov and, Fabio Cappellini

PDF

Open Access

TL;DR

This paper investigates the vulnerability of hardware designs generated by Large Language Models to Common Weakness Enumerations (CWEs) using formal verification, revealing that most LLM-generated designs are prone to security and safety issues.

Contribution

It introduces a dataset of 60,000 SystemVerilog designs generated by LLMs, annotated for CWE vulnerabilities, and highlights the lack of CWE awareness in LLMs for hardware design.

Findings

01

Approximately 60% of LLM-generated designs are CWE-prone.

02

Most LLMs are unaware of hardware CWEs during code generation.

03

The dataset can be used to train models to avoid CWE vulnerabilities.

Abstract

Modern hardware designs have grown increasingly efficient and complex. However, they are often susceptible to Common Weakness Enumerations (CWEs). This paper is focused on the formal verification of CWEs in a dataset of hardware designs written in SystemVerilog from Regenerative Artificial Intelligence (AI) powered by Large Language Models (LLMs). We applied formal verification to categorize each hardware design as vulnerable or CWE-free. This dataset was generated by 4 different LLMs and features a unique set of designs for each of the 10 CWEs we target in our paper. We have associated the identified vulnerabilities with CWE numbers for a dataset of 60,000 generated SystemVerilog Register Transfer Level (RTL) code. It was also found that most LLMs are not aware of any hardware CWEs; hence they are usually not considered when generating the hardware code. Our study reveals that…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsScientific Computing and Data Management · Image Processing and 3D Reconstruction · Robotics and Automated Systems

MethodsSparse Evolutionary Training · Attentive Walk-Aggregating Graph Neural Network