Interleaving Large Language Models for Compiler Testing

Yunbo Ni; Shaohua Li

arXiv:2508.18955·cs.SE·August 27, 2025

TL;DR

This paper introduces LegoFuzz, a testing framework for C compilers that uses LLMs offline to generate small, feature-rich code snippets and then strategically assembles those snippets online into valid test programs. It found 66 bugs in GCC and LLVM.

Contribution

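The paper presents a two-phase testing framework: an offline phase in which LLMs generate a collection of code snippets, and an online phase that reuses and combines those snippets into test programs. The split targets two problems of earlier LLM-based compiler testing: generated programs that are too simple, and the computational cost of extensive testing with LLMs.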

Findings

01. Discovered 66 bugs in GCC and LLVM.

02. Nearly half of the bugs are miscompilation bugs.

03. The approach outperforms existing LLM-based testing tools.

Abstract

Testing compilers with AI models, especially large language models (LLMs), has shown great promise. However, current approaches struggle with two key problems: The generated programs for testing compilers are often too simple, and extensive testing with the LLMs is computationally expensive. In this paper, we propose a novel compiler testing framework that decouples the testing process into two distinct phases: an offline phase and an online phase. In the offline phase, we use LLMs to generate a collection of small but feature-rich code pieces. In the online phase, we reuse these code pieces by strategically combining them to build high-quality and valid test programs, which are then used to test compilers. We implement this idea in a tool, LegoFuzz, for testing C compilers. The results are striking: we found 66 bugs in GCC and LLVM, the most widely used C compilers. Almost half of…
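
The two-phase idea in the abstract can be illustrated with a minimal sketch. This is not LegoFuzz's implementation: the snippet pool below is hand-written as a stand-in for LLM output, and the names `SNIPPET_POOL`, `assemble`, and `piece_N`, as well as the chaining strategy, are assumptions made for illustration only.

```python
import random

# Offline phase (stand-in): in the paper these pieces are generated by an LLM.
# Here they are hypothetical, hand-written C functions sharing one signature.
# Unsigned arithmetic is used so the pieces contain no undefined behavior.
SNIPPET_POOL = [
    "unsigned {name}(unsigned x) {{ return (x << 3) ^ (x >> 1); }}",
    "unsigned {name}(unsigned x) {{ unsigned s = 0; "
    "for (unsigned i = 0; i < 8; i++) s += x & i; return s; }}",
    "unsigned {name}(unsigned x) {{ return x % 7u == 0u ? x / 7u : x + 7u; }}",
    "unsigned {name}(unsigned x) {{ return (x * 2654435761u) >> 16; }}",
]


def assemble(pool, k, seed):
    """Online phase (assumed strategy): pick k pieces and chain their calls."""
    rng = random.Random(seed)           # seeded so a test program is reproducible
    chosen = rng.sample(pool, k)        # k distinct pieces from the pool
    # Give every piece a unique function name to avoid symbol clashes.
    funcs = [tmpl.format(name=f"piece_{i}") for i, tmpl in enumerate(chosen)]
    # Feed each piece's result into the next so all pieces affect the output.
    calls = "\n".join(f"    acc = piece_{i}(acc + {i}u);" for i in range(k))
    main = (
        "int main(void) {\n"
        "    unsigned acc = 1u;\n"
        f"{calls}\n"
        '    printf("%u\\n", acc);\n'
        "    return 0;\n"
        "}"
    )
    return "#include <stdio.h>\n\n" + "\n".join(funcs) + "\n\n" + main + "\n"


if __name__ == "__main__":
    print(assemble(SNIPPET_POOL, k=3, seed=0))
```

The generated string is a complete C program that could be written to a file and compiled. One common way to use such a program is to compile it with different compilers or optimization levels and compare the printed value; the abstract does not state how LegoFuzz itself detects miscompilations, so that step is left out of the sketch.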
