Interleaving Large Language Models for Compiler Testing
Yunbo Ni, Shaohua Li

TL;DR
This paper introduces LegoFuzz, a novel framework that enhances compiler testing by combining offline generation of feature-rich code snippets with online strategic assembly, leading to improved bug detection in C compilers.
Contribution
The paper presents a two-phase testing framework that uses LLMs offline to generate small code snippets and then combines those snippets online into test programs, improving bug detection in compiler testing.
Findings
Discovered 66 bugs in GCC and LLVM.
Nearly half of the bugs are miscompilation bugs.
The approach outperforms existing LLM-based testing tools.
Abstract
Testing compilers with AI models, especially large language models (LLMs), has shown great promise. However, current approaches struggle with two key problems: the generated programs for testing compilers are often too simple, and extensive testing with the LLMs is computationally expensive. In this paper, we propose a novel compiler testing framework that decouples the testing process into two distinct phases: an offline phase and an online phase. In the offline phase, we use LLMs to generate a collection of small but feature-rich code pieces. In the online phase, we reuse these code pieces by strategically combining them to build high-quality and valid test programs, which are then used to test compilers. We implement this idea in a tool, LegoFuzz, for testing C compilers. The results are striking: we found 66 bugs in GCC and LLVM, the most widely used C compilers. Almost half of…
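The two-phase split described in the abstract can be sketched roughly as follows. This is a minimal Python illustration, not LegoFuzz's implementation: the snippet pool is hard-coded where the paper would use LLM output, and the names `SNIPPET_POOL`, `snippet_name`, and `assemble`, along with the random call-chaining strategy, are assumptions made for this example only.

```python
import random

# Hypothetical stand-in for the offline phase: in the paper's setting an LLM
# generates these pieces once; here they are hard-coded for illustration.
# Unsigned arithmetic is used so the C code has no undefined behavior.
SNIPPET_POOL = [
    "unsigned f_shift(unsigned x) { return (x << 3) ^ (x >> 1); }",
    "unsigned f_loop(unsigned x) { unsigned s = 0; for (unsigned i = 0; i < 8; i++) s += x ^ i; return s; }",
    "unsigned f_branch(unsigned x) { return (x & 1u) ? x * 3u + 1u : x / 2u; }",
]


def snippet_name(snippet):
    """Extract NAME from a one-line 'unsigned NAME(unsigned x) {...}' snippet."""
    return snippet.split("(")[0].split()[-1]


def assemble(pool, k, seed):
    """Online phase (assumed strategy): pick k snippets and chain their calls."""
    rng = random.Random(seed)
    chosen = rng.sample(pool, k)
    lines = ["#include <stdio.h>"]
    lines.extend(chosen)
    lines.append("int main(void) {")
    lines.append("    unsigned v = 7u;")
    for snippet in chosen:
        # Feed the output of one snippet into the next.
        lines.append(f"    v = {snippet_name(snippet)}(v);")
    lines.append('    printf("%u\\n", v);')
    lines.append("    return 0;")
    lines.append("}")
    return "\n".join(lines) + "\n"


# No LLM call is needed at this point: the pool is simply reused.
program = assemble(SNIPPET_POOL, k=2, seed=0)
print(program)
```

The point of the design is that the expensive step (building the pool) happens once, while `assemble` is cheap and can be called with many seeds. A real harness would then compile each program with the compilers under test and compare the results; that step is omitted here.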
