SpecPylot: Python Specification Generation using Large Language Models
Ragib Shahariar Ayon, Shibbir Ahmed

TL;DR
SpecPylot leverages large language models and symbolic execution to automatically generate, validate, and refine executable Python specifications, improving program correctness verification.
Contribution
It introduces a novel tool that synthesizes and verifies Python specifications using LLMs and symbolic execution, addressing common issues in specification generation.
Findings
SpecPylot generates crosshair-compatible contracts for most Python programs.
The tool effectively refines specifications based on counterexamples from symbolic execution.
Limitations include bounded exploration and variability in LLM outputs.
Abstract
Automatically generating formal specifications could reduce the effort needed to improve program correctness, but in practice, this is still challenging. Many developers avoid writing contracts by hand, which limits the use of automated verification tools. Recent large language models (LLMs) can generate specifications from code, but these specifications often fail in terms of verification. The reason is syntax errors, overly strict constraints, or mismatches with program behavior. We present SpecPylot, a Python tool that synthesizes executable specifications for Python programs as icontract annotations and checks them using crosshair's symbolic execution. The tool relies on LLMs to propose candidate contracts and uses crosshair to validate them. When crosshair finds a concrete counterexample, SpecPylot updates only the generated contracts and leaves the program itself untouched. In…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
