Machine-Generated, Machine-Checked Proofs for a Verified Compiler (Experience Report)
Zoe Paraskevopoulou

TL;DR
This experience report describes how an agentic coding assistant (Claude Code), working under human guidance but without human proof writing, produced a machine-checked Rocq proof of semantic preservation for the ANF transformation in the CertiCoq verified compiler in approximately 96 hours.
Contribution
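It shows that a large language model can use an existing human-written proof (CertiCoq's CPS correctness proof) as a template and adapt its proof technique to a related transformation (ANF) that differs in important technical ways, and it reports on the experience of developing the proof this way.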
Findings
Proof development with the LLM took approximately 96 hours; the closely related CPS proof took human experts several months.
The LLM-developed ANF proof is larger (approximately 7,800 lines of Rocq versus 5,300 lines for the CPS proof) but comparable in complexity to the human-written proof.
The approach highlights potential for reducing human effort in formal verification.
Abstract
We report on using an agentic coding assistant (Claude Code, powered by Claude Opus 4.6) to mechanize a substantial Rocq correctness proof from scratch, with human guidance but without human proof writing. The proof establishes semantic preservation for the administrative normal form (ANF) transformation in the CertiCoq verified compiler for Rocq. The closely related continuation-passing style (CPS) transformation in CertiCoq was previously proved correct by human experts over several months. We use this proof as a template and instruct the LLM to adapt the proof technique to the ANF setting, which differs in important technical ways. The resulting ANF proof comprises approximately 7,800 lines of Rocq (larger than the 5,300-line CPS proof) and was developed in approximately 96 hours. We describe the proof technique and report on the experience of developing it with an LLM, discussing…
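For readers unfamiliar with the ANF transformation named in the abstract, the following is a minimal Python sketch on a toy arithmetic language. The tuple encoding, the function names to_anf and evaluate, and the _t temporaries are all hypothetical and chosen for illustration; this is not CertiCoq's definition of the transformation. The final check only tests semantic preservation on a single input, whereas the paper establishes it as a machine-checked Rocq proof.

```python
import itertools

# Toy expressions as tuples (hypothetical encoding, not CertiCoq's):
#   ("num", n) | ("var", x) | ("add", e1, e2) | ("mul", e1, e2)
# ANF output additionally uses ("let", x, rhs, body), where rhs is an
# operation applied only to atoms (numbers or variables).


def to_anf(expr):
    """Name every intermediate result so operators only see atoms."""
    counter = itertools.count()

    def atomize(e):
        # Returns (bindings, atom); bindings are (name, rhs) in evaluation order.
        tag = e[0]
        if tag in ("num", "var"):
            return [], e
        if tag in ("add", "mul"):
            left_bindings, left = atomize(e[1])
            right_bindings, right = atomize(e[2])
            # Assumption: source programs never use names starting with "_t".
            name = f"_t{next(counter)}"
            bindings = left_bindings + right_bindings + [(name, (tag, left, right))]
            return bindings, ("var", name)
        raise ValueError(f"unknown expression tag: {tag!r}")

    bindings, result = atomize(expr)
    # Wrap the final atom in nested lets, innermost binding last.
    for name, rhs in reversed(bindings):
        result = ("let", name, rhs, result)
    return result


def evaluate(e, env):
    """Big-step evaluator that handles both source and ANF terms."""
    tag = e[0]
    if tag == "num":
        return e[1]
    if tag == "var":
        return env[e[1]]
    if tag == "add":
        return evaluate(e[1], env) + evaluate(e[2], env)
    if tag == "mul":
        return evaluate(e[1], env) * evaluate(e[2], env)
    if tag == "let":
        _, name, rhs, body = e
        return evaluate(body, {**env, name: evaluate(rhs, env)})
    raise ValueError(f"unknown expression tag: {tag!r}")


source = ("add", ("mul", ("var", "x"), ("num", 2)), ("num", 1))  # x * 2 + 1
anf = to_anf(source)

# Semantic preservation, checked on one environment only.
assert evaluate(source, {"x": 5}) == evaluate(anf, {"x": 5})
```

Here the nested product is bound to _t0 before the sum is bound to _t1, so every operator argument in the output is a variable or a constant. A test like the last line can only sample inputs; a mechanized proof is what rules out the cases a test never tries.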
Taxonomy
Topics: Logic, programming, and type systems · Security and Verification in Computing · Formal Methods in Verification
