Reinforcement Learning with Negative Tests as Completeness Signal for Formal Specification Synthesis
Zhechong Huang, Zhao Zhang, Zeyu Sun, Huifeng Sun, Yingfei Xiong

TL;DR
This paper introduces SpecRL, a reinforcement learning framework that uses negative tests as a signal to improve the synthesis of formal specifications, enhancing verification success.
Contribution
SpecRL innovatively incorporates negative test generation into RL for specification synthesis, providing a fine-grained completeness signal that outperforms prior methods.
Findings
SpecRL improves specification strength and verification success.
It generalizes well to out-of-distribution benchmarks.
It remains competitive with larger LLMs on unseen data.
Abstract
The specification synthesis task aims to automatically generate specifications, together with any necessary auxiliary verification annotations, for existing programs. This task is important because such specifications serve as behavioral contracts that support modular reasoning and reusable verification across a codebase. At the same time, it remains challenging because verifier-only feedback is fundamentally incomplete: passing verification establishes soundness, but cannot distinguish weak specifications from strong ones. What is missing is a fine-grained signal for specification completeness. We present SpecRL, a reinforcement learning framework for specification synthesis in Dafny. SpecRL introduces a self-contained pipeline that generates negative tests, i.e., input-output pairs that can never be produced by the program. We use the fraction of these negative tests rejected by a…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
