SpecSyn: LLM-based Synthesis and Refinement of Formal Specifications for Real-world Program Verification

Lezhi Ma; Shangqing Liu; Yi Li; Qiong Wu; Han Wang; and Lei Bu

arXiv:2604.21570·cs.SE·April 24, 2026

SpecSyn: LLM-based Synthesis and Refinement of Formal Specifications for Real-world Program Verification

Lezhi Ma, Shangqing Liu, Yi Li, Qiong Wu, Han Wang, and Lei Bu

PDF

TL;DR

SpecSyn leverages large language models to generate and refine formal specifications for real-world program verification, addressing input size and evaluation challenges to improve accuracy and applicability.

Contribution

It introduces a novel LLM-based method with a specification refinement mechanism that enhances the semantic strength of generated specifications for complex programs.

Findings

01

SpecSyn achieves over 90% precision and 75% recall in specification generation.

02

It successfully handles 1071 out of 1365 verification properties in real-world programs.

03

Outperforms existing LLM-based approaches significantly.

Abstract

Program verification is a formal technique to rigorously ensure the correctness and fault-freeness of software systems. However, constructing comprehensive interprocedural specifications for full verification obligations is time-consuming and labor-intensive, giving rise to automated specification generation approaches. Despite the significant advancements in these approaches brought by Large Language Models (LLMs), existing LLM-empowered approaches still suffer from significant limitations: they lack effective strategies for handling sizable input programs, and are typically equipped with no mechanisms to evaluate and guarantee the strength of the generated specifications. The limitations impair their ability to extract precise specifications from real-world complicated programs to support the verification of target properties, thereby hindering the applicability of existing approaches…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.