ComBench: A Repo-level Real-world Benchmark for Compilation Error Repair

Jia Li, Zeyang Zhuang, Zhuangbin Chen, Yuxin Su, Wei Meng, and Michael R. Lyu

arXiv:2603.27333 · cs.SE · March 31, 2026

TL;DR

ComBench is a novel, repository-level benchmark for real-world C/C++ compilation error repair, enabling more accurate evaluation of AI models' effectiveness in practical software development scenarios.

Contribution

The paper introduces a systematic, automated framework that builds a high-quality, reproducible benchmark from the GitHub CI histories of open-source projects, addressing the limitations of existing single-file datasets.

Findings

01

GPT-5 achieves 73% syntactic success but only 41% semantic correctness.

02

Different models show distinct strengths for various error types.

03

ComBench enables realistic evaluation of AI-based compilation error repair methods.
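
The gap reported in the first finding comes from counting two different things: patches that compile versus patches that are also semantically correct. The following is a minimal sketch of how two such rates could be computed from per-instance outcomes. It is not the paper's evaluation code: the `RepairResult` record, its field names, the rule that a semantically correct patch must also compile, and the toy data are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class RepairResult:
    """Outcome of one repair attempt (hypothetical record layout)."""
    instance_id: str
    compiles: bool          # patched repository builds without errors
    semantically_ok: bool   # patch is also judged to preserve intended behavior


def success_rates(results):
    """Return (syntactic_rate, semantic_rate) as fractions of all attempts."""
    if not results:
        return 0.0, 0.0
    total = len(results)
    syntactic = sum(r.compiles for r in results)
    # Assumption: a patch counts as semantically correct only if it also compiles.
    semantic = sum(r.compiles and r.semantically_ok for r in results)
    return syntactic / total, semantic / total


# Toy data invented purely for illustration (not results from the paper).
toy = [
    RepairResult("a", True, True),
    RepairResult("b", True, False),
    RepairResult("c", False, False),
    RepairResult("d", True, True),
]
syntactic_rate, semantic_rate = success_rates(toy)
```

Under this assumed definition the semantic rate can never exceed the syntactic rate, so the difference between the two is the share of patches that silence the compiler without being a correct fix.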

Abstract

Compilation errors pose pervasive and critical challenges in software development, significantly hindering productivity. Automated Compilation Error Repair (ACER) techniques have therefore been proposed to mitigate these issues. Despite recent advances in ACER, its real-world performance remains poorly evaluated. This can be largely attributed to the limitations of existing benchmarks, i.e., decontextualized single-file data, a lack of authentic source diversity, and biased local task modeling that ignores crucial repository-level complexities. To bridge this critical gap, we propose ComBench, the first repository-level, reproducible real-world benchmark for C/C++ compilation error repair. ComBench is constructed through a novel, automated framework that systematically mines real-world failures from the GitHub CI histories of large-scale open-source projects. Our framework contributes…
