LLM For Loop Invariant Generation and Fixing: How Far Are We?
Mostafijur Rahman Akhond, Saikat Chakraborty, and Gias Uddin

TL;DR
This paper empirically evaluates the ability of various Large Language Models to infer and repair loop invariants in programs, revealing moderate success rates that improve with additional context but remain limited in repair capabilities.
Contribution
It provides the first comprehensive assessment of LLMs' performance in inferring and fixing loop invariants, highlighting their potential and current limitations.
Findings
LLMs achieve up to 78% success in invariant inference.
LLMs only repair invariants successfully 16% of the time.
Auxiliary information significantly improves LLM performance.
Abstract
A loop invariant is a property of a loop that remains true before and after each execution of the loop. The identification of loop invariants is a critical step to support automated program safety assessment. Recent advancements in Large Language Models (LLMs) have demonstrated potential in diverse software engineering (SE) and formal verification tasks. However, we are not aware of the performance of LLMs to infer loop invariants. We report an empirical study of both open-source and closed-source LLMs of varying sizes to assess their proficiency in inferring inductive loop invariants for programs and in fixing incorrect invariants. Our findings reveal that while LLMs exhibit some utility in inferring and repairing loop invariants, their performance is substantially enhanced when supplemented with auxiliary information such as domain knowledge and illustrative examples. LLMs achieve a…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsSoftware Testing and Debugging Techniques · Security and Verification in Computing · Formal Methods in Verification
