WybeCoder: Verified Imperative Code Generation
Fabian Gloeckle, Mantas Baksys, Darius Feher, Kunhao Zheng, Amaury Hayat, Sean B. Holden, Gabriel Synnaeve, Peter O'Hearn

TL;DR
WybeCoder is a novel framework that enables prove-as-you-generate development, co-evolving code, invariants, and proofs, significantly advancing automated verification of imperative code using LLMs and formal methods.
Contribution
It introduces WybeCoder, integrating automatic verification with interactive proofs, and demonstrates scalable, improved verification performance on complex algorithms and benchmarks.
Findings
Solved 74% of Verina tasks and 62% of Clever tasks.
Synthesized dozens of invariants and dispatched numerous subgoals.
Produced hundreds of lines of verified code, surpassing previous benchmarks.
Abstract
Recent progress in large language models (LLMs) has substantially advanced automatic code generation and formal theorem proving, yet software verification has not seen comparable gains. To address this gap, we propose WybeCoder, an agentic code verification framework that enables prove-as-you-generate development, in which code, invariants, and proofs co-evolve. WybeCoder builds on a recent framework that combines automatic verification condition generation and SMT solving with interactive proofs in Lean. To enable systematic evaluation, we translate two benchmarks for functional verification in Lean, Verina and Clever, into equivalent imperative code specifications. On complex algorithms such as Heapsort, we observe consistent performance improvements as we scale our approach, synthesizing dozens of valid invariants and dispatching dozens of subgoals, ultimately producing hundreds of…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
