Improving Compiler Bug Isolation by Leveraging Large Language Models

Yixian Qi; Jiajun Jiang; Fengjie Li; Bowen Chen; Hongyu Zhang; Junjie Chen

arXiv:2506.17647·cs.SE·June 24, 2025

TL;DR

This paper introduces AutoCBI, a compiler bug localization method that uses large language models to summarize the functions in compiler source files and to reorder a ranking of suspicious files. The authors report that it outperforms existing approaches.

Contribution

The paper presents AutoCBI, the first approach to use LLMs for compiler bug localization, enhancing accuracy and efficiency over prior methods.

Findings

1. AutoCBI isolates significantly more bugs in top ranks than state-of-the-art methods.
2. Using LLMs improves bug localization effectiveness in large compiler codebases.
3. Component ablation confirms each part's importance in AutoCBI's success.

Abstract

Compilers play a foundational role in building reliable software systems, and bugs within them can lead to catastrophic consequences. The compilation process typically involves hundreds of files, making traditional automated bug isolation techniques inapplicable due to scalability or effectiveness issues. Current mainstream compiler bug localization techniques have limitations in test program mutation and resource consumption. Inspired by recent advances in pre-trained Large Language Models (LLMs), we propose an innovative approach named AutoCBI, which (1) uses LLMs to summarize the functions in compiler source files and (2) employs specialized prompts to guide the LLM in reordering the ranking of suspicious files. This approach leverages four types of information: the failing test program, source file function summaries, lists of suspicious files identified through analyzing test coverage, as well as…
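The reranking step described in the abstract can be pictured with a minimal Python sketch. This is not the authors' implementation: the prompt wording, the function names (`build_rerank_prompt`, `parse_ranking`, `rerank`), and the `ask_llm` callable are all hypothetical, and only the three information types that the abstract names in full are modeled, since the fourth is cut off in the text above.

```python
from typing import Callable, Dict, List


def build_rerank_prompt(
    failing_program: str,
    summaries: Dict[str, str],
    suspicious: List[str],
) -> str:
    """Assemble a reranking prompt (hypothetical wording, not from the paper)."""
    lines = [
        "The following test program triggers a compiler bug:",
        failing_program,
        "",
        "Suspicious compiler files from coverage analysis, most suspicious first:",
    ]
    for rank, path in enumerate(suspicious, start=1):
        # Attach the LLM-written summary of each file, if one exists.
        summary = summaries.get(path, "(no summary available)")
        lines.append(f"{rank}. {path}: {summary}")
    lines.append("")
    lines.append("Reorder the files from most to least likely to contain the bug.")
    lines.append("Answer with one file path per line and nothing else.")
    return "\n".join(lines)


def parse_ranking(reply: str, suspicious: List[str]) -> List[str]:
    """Turn the model's reply into a full ranking over the known files."""
    known = set(suspicious)
    ordered: List[str] = []
    for raw in reply.splitlines():
        name = raw.strip()
        # Ignore unknown or repeated paths so the model cannot invent files.
        if name in known and name not in ordered:
            ordered.append(name)
    # Files the model omitted keep their original relative order at the end.
    missing = [path for path in suspicious if path not in ordered]
    return ordered + missing


def rerank(
    failing_program: str,
    summaries: Dict[str, str],
    suspicious: List[str],
    ask_llm: Callable[[str], str],  # assumed: takes a prompt, returns reply text
) -> List[str]:
    prompt = build_rerank_prompt(failing_program, summaries, suspicious)
    return parse_ranking(ask_llm(prompt), suspicious)
```

Passing the model in as a plain callable keeps the sketch independent of any particular LLM client, and the parser falls back to the coverage-based order for any file the reply leaves out, so the output is always a permutation of the input list.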
