SpecGen: Automated Generation of Formal Program Specifications via Large Language Models
Lezhi Ma, Shangqing Liu, Yi Li, Xiaofei Xie, Lei Bu

TL;DR
SpecGen leverages large language models to automate the generation of formal program specifications, effectively handling complex programs and outperforming existing tools through a two-phase process involving conversational guidance and mutation-based refinement.
Contribution
Introduces SpecGen, a novel LLM-based method for formal specification generation that overcomes limitations of template-based approaches using a two-phase process with mutation and heuristic selection.
Findings
Successfully generated verifiable specifications for 279 out of 385 programs.
Outperformed existing LLM-based approaches and tools like Houdini and Daikon.
Demonstrated comprehensive articulation of program behaviors.
Abstract
Formal program specifications play a crucial role in various stages of software development. However, manually crafting formal program specifications is rather difficult, making the job time-consuming and labor-intensive. It is even more challenging to write specifications that correctly and comprehensively describe the semantics of complex programs. To reduce the burden on software developers, automated specification generation methods have emerged. However, existing methods usually rely on predefined templates or grammar, making them struggle to accurately describe the behavior and functionality of complex real-world programs. To tackle this challenge, we introduce SpecGen, a novel technique for formal program specification generation based on Large Language Models. Our key insight is to overcome the limitations of existing methods by leveraging the code comprehension capability of…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsSoftware Engineering Research · Scientific Computing and Data Management · Software Testing and Debugging Techniques
