Rango: Adaptive Retrieval-Augmented Proving for Automated Software   Verification

Kyle Thompson; Nuno Saavedra; Pedro Carrott; Kevin Fisher; Alex; Sanchez-Stern; Yuriy Brun; Jo\~ao F. Ferreira; Sorin Lerner; Emily First

arXiv:2412.14063·cs.SE·January 30, 2025

Rango: Adaptive Retrieval-Augmented Proving for Automated Software Verification

Kyle Thompson, Nuno Saavedra, Pedro Carrott, Kevin Fisher, Alex, Sanchez-Stern, Yuriy Brun, Jo\~ao F. Ferreira, Sorin Lerner, Emily First

PDF

Open Access 1 Repo

TL;DR

Rango is an automated proof synthesis tool for Coq that leverages retrieval-augmented large language models to identify relevant premises and similar proofs, significantly improving proof automation in software verification.

Contribution

This work introduces Rango, a novel retrieval-augmented proof synthesis system for Coq that adapts to project context and proof state, outperforming previous methods.

Findings

01

Rango synthesizes proofs for 32.0% of theorems, 29% more than Tactician.

02

Adding relevant proofs increases the number of proven theorems by 47%.

03

Created CoqStoq, a large dataset of Coq projects and theorems for training and evaluation.

Abstract

Formal verification using proof assistants, such as Coq, enables the creation of high-quality software. However, the verification process requires significant expertise and manual effort to write proofs. Recent work has explored automating proof synthesis using machine learning and large language models (LLMs). This work has shown that identifying relevant premises, such as lemmas and definitions, can aid synthesis. We present Rango, a fully automated proof synthesis tool for Coq that automatically identifies relevant premises and also similar proofs from the current project and uses them during synthesis. Rango uses retrieval augmentation at every step of the proof to automatically determine which proofs and premises to include in the context of its fine-tuned LLM. In this way, Rango adapts to the project and to the evolving state of the proof. We create a new dataset, CoqStoq, of…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Code & Models

Repositories

rkthomps/coq-modeling
noneOfficial

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsSoftware Testing and Debugging Techniques · Software System Performance and Reliability · Software Engineering Research