Cobblestone: A Divide-and-Conquer Approach for Automating Formal Verification

Saketh Ram Kasibatla; Arpan Agarwal; Yuriy Brun; Sorin Lerner; Talia Ringer; Emily First

arXiv:2410.19940·cs.LO·April 8, 2026·2 cites

Cobblestone: A Divide-and-Conquer Approach for Automating Formal Verification

Saketh Ram Kasibatla, Arpan Agarwal, Yuriy Brun, Sorin Lerner, Talia Ringer, Emily First

PDF

TL;DR

Cobblestone is a divide-and-conquer method leveraging large language models to automate proof synthesis in formal verification, outperforming existing tools on multiple benchmarks.

Contribution

Introduces Cobblestone, a novel divide-and-conquer approach that uses LLMs for proof synthesis, improving automation and success rates in formal verification tasks.

Findings

01

Outperforms state-of-the-art non-LLM tools on four benchmarks.

02

Proves many theorems that other LLM-based tools cannot.

03

Proves up to 58% of theorems with external input.

Abstract

Formal verification using proof assistants, such as Coq, is an effective way of improving software quality, but requires significant effort and expertise. Machine learning can automatically synthesize proofs, but such tools are able to prove only a fraction of desired software properties. We introduce Cobblestone, a divide-and-conquer approach for proof synthesis. Cobblestone uses a large language model (LLM) to generate potential proofs, uses those proofs to break the problem into simpler parts, automatically identifies which of those parts were successfully proven, and iterates on the remaining parts to build a correct proof that is guaranteed to be sound, despite the reliance on unsound LLMs. We evaluate Cobblestone on four benchmarks of open-source Coq projects, controlling for training data leakage. Fully automatically, Cobblestone outperforms state-of-the-art non-LLM tools, and…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.