Improving Dynamic Specification Inference with LLM-Generated Counterexamples
Agustín Balestra, Agustín Nolasco, Facundo Molina, Diego Garbervetsky, Renzo Degiovanni, Nazareno Aguirre

TL;DR
This paper uses large language models to generate counterexamples that discard invalid candidate assertions produced by dynamic specification inference, reducing false positives and raising precision.
Contribution
The paper introduces an approach that uses LLM-generated counterexamples to strengthen dynamic specification inference, compensating for test suites that lack sufficiently diverse tests.
Findings
LLMs can generate effective counterexamples that discard up to 11.68% of invalid assertions.
Incorporating LLM-generated counterexamples improves inference precision by up to 7%.
The approach maintains recall while reducing false positives.
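The findings above can be illustrated with a toy sketch. It is not the authors' implementation and does not use Daikon's API: the method under analysis, the candidate assertions, the set of valid assertions, and the "LLM-suggested" input are all hypothetical, chosen only to show how one extra execution can discard invalid candidates without losing valid ones. The precision figures it computes are for this toy only and are unrelated to the paper's results.

```python
from typing import Callable, Dict, List, Set, Tuple

# An observed state at method exit: (input x, returned result).
State = Tuple[int, int]
Assertion = Callable[[State], bool]


def method_under_analysis(x: int) -> int:
    """Toy method (hypothetical): absolute value."""
    return -x if x < 0 else x


# Candidate postconditions, as a dynamic inferrer might propose them.
CANDIDATES: Dict[str, Assertion] = {
    "result >= 0": lambda s: s[1] >= 0,
    "result >= x": lambda s: s[1] >= s[0],
    "result == x": lambda s: s[1] == s[0],  # invalid: fails for x < 0
    "x >= 0": lambda s: s[0] >= 0,          # invalid: artifact of the tests
}

# Ground truth for the toy method (assumed known here, for measurement only).
VALID: Set[str] = {"result >= 0", "result >= x"}


def run(inputs: List[int]) -> List[State]:
    """Execute the method and record the observed states."""
    return [(x, method_under_analysis(x)) for x in inputs]


def surviving(candidates: Dict[str, Assertion], states: List[State]) -> Set[str]:
    """Keep only the candidates that hold on every observed state."""
    return {name for name, check in candidates.items()
            if all(check(s) for s in states)}


def precision(reported: Set[str], valid: Set[str]) -> float:
    """Fraction of reported assertions that are actually valid."""
    return len(reported & valid) / len(reported) if reported else 1.0


# A test suite with no negative inputs: every candidate survives.
test_states = run([0, 1, 5])
before = surviving(CANDIDATES, test_states)

# One additional input, standing in for an LLM-generated counterexample.
counterexample_states = run([-3])
after = surviving(CANDIDATES, test_states + counterexample_states)

print(sorted(before - after))                         # discarded candidates
print(precision(before, VALID), precision(after, VALID))
print(VALID <= after)                                 # valid assertions kept
```

Because the counterexample here is a real execution of the method, it can only falsify assertions that are genuinely invalid, which is why the valid candidates all survive in this sketch.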
Abstract
Contract assertions, such as preconditions, postconditions, and invariants, play a crucial role in software development, enabling applications such as program verification, test generation, and debugging. Despite their benefits, the adoption of contract assertions is limited, due to the difficulty of manually producing such assertions. Dynamic analysis-based approaches, such as Daikon, can aid in this task by inferring expressive assertions from execution traces. However, a fundamental weakness of these methods is their reliance on the thoroughness of the test suites used for dynamic analysis. When these test suites do not contain sufficiently diverse tests, the inferred assertions are often not generalizable, leading to a high rate of invalid candidates (false positives) that must be manually filtered out. In this paper, we explore the use of large language models (LLMs) to…
