Ranking LLM-Generated Loop Invariants for Program Verification

Saikat Chakraborty; Shuvendu K. Lahiri; Sarah Fakhoury; Madanlal; Musuvathi; Akash Lal; Aseem Rastogi; Aditya Senthilnathan; Rahul Sharma,; Nikhil Swamy

arXiv:2310.09342·cs.PL·February 14, 2024·1 cites

Ranking LLM-Generated Loop Invariants for Program Verification

Saikat Chakraborty, Shuvendu K. Lahiri, Sarah Fakhoury, Madanlal, Musuvathi, Akash Lal, Aseem Rastogi, Aditya Senthilnathan, Rahul Sharma,, Nikhil Swamy

PDF

Open Access 1 Repo

TL;DR

This paper introduces a re-ranking method for Large Language Model-generated loop invariants that improves the selection of correct invariants, reducing verifier calls and enhancing program verification efficiency.

Contribution

We propose a contrastive re-ranking approach that effectively distinguishes correct invariants from incorrect ones generated by LLMs, improving verification performance.

Findings

01

Re-ranking significantly increases correct invariant selection

02

Reduces number of verifier calls needed

03

Enhances automation in program verification

Abstract

Synthesizing inductive loop invariants is fundamental to automating program verification. In this work, we observe that Large Language Models (such as gpt-3.5 or gpt-4) are capable of synthesizing loop invariants for a class of programs in a 0-shot setting, yet require several samples to generate the correct invariants. This can lead to a large number of calls to a program verifier to establish an invariant. To address this issue, we propose a {\it re-ranking} approach for the generated results of LLMs. We have designed a ranker that can distinguish between correct inductive invariants and incorrect attempts based on the problem definition. The ranker is optimized as a contrastive ranker. Experimental results demonstrate that this re-ranking mechanism significantly improves the ranking of correct invariants among the generated candidates, leading to a notable reduction in the number of…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Code & Models

Repositories

microsoft/NeuralInvariantRanker
pytorchOfficial

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsSoftware Testing and Debugging Techniques · Software Engineering Research · Natural Language Processing Techniques