Goedel-Code-Prover: Hierarchical Proof Search for Open State-of-the-Art Code Verification
Zenan Li, Ziran Yang, Deyuan He, Haoyu Zhao, Andrew Zhao, Shange Tang, Kaiyu Yang, Aarti Gupta, Zhendong Su, Chi Jin

TL;DR
This paper introduces Goedel-Code-Prover, a hierarchical proof search framework for automated code verification in Lean 4, achieving significant success rates and efficiency improvements over existing methods.
Contribution
It presents a novel hierarchical proof search approach with a unified policy trained via reinforcement learning, improving automation and success rates in code verification.
Findings
Achieves 62.0% prove success rate on benchmarks, 2.6× better than baselines.
Model success improves monotonically with search iterations and sampling budget.
Outperforms larger neural provers in efficiency and success rate.
Abstract
Large language models (LLMs) can generate plausible code but offer limited guarantees of correctness. Formally verifying that implementations satisfy specifications requires constructing machine-checkable proofs, a task that remains beyond current automation. We propose a hierarchical proof search framework for automated code verification in Lean~4 that decomposes complex verification goals into structurally simpler subgoals before attempting tactic-level proving. Central to our approach is a principled decomposition score that combines constructive justification with structural effectiveness. Crucially, this score serves as both the training reward and the inference-time ranking criterion, ensuring strict alignment between optimization and deployment. We train Goedel-Code-Prover-8B, a single unified policy for both decomposition and completion, via supervised initialization followed by…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
