SAGE: Specification-Aware Grammar Extraction for Automated Test Case Generation with LLMs

Aditi; Hyunwoo Park; Sicheol Sung; Yo-Sub Han; Sang-Ki Ko

arXiv:2506.11081 · cs.CL · June 16, 2025

Open Access

TL;DR

This paper introduces SAGE, a method that uses large language models and reinforcement learning to automatically generate valid, general grammars from natural language specifications, significantly improving test case generation quality.

Contribution

The work presents a novel approach combining LLM fine-tuning and reward-guided reinforcement learning to induce context-free grammars with counters from specifications, enhancing validity and generality.

Findings

01

SAGE outperforms 17 LLMs in grammar validity and test effectiveness.

02

The approach improves on the state of the art in grammar validity by over 15%.

03

Iterative feedback helps correct both syntactic and semantic errors in the generated grammars.

Abstract

Grammar-based test case generation has proven effective for competitive programming problems, but generating valid and general grammars from natural language specifications remains a key challenge, especially under limited supervision. Context-Free Grammars with Counters (CCFGs) have recently been introduced as a formalism to represent such specifications with logical constraints by storing and reusing counter values during derivation. In this work, we explore the use of open-source large language models (LLMs) to induce CCFGs from specifications using a small number of labeled examples and verifiable reward-guided reinforcement learning. Our approach first fine-tunes an open-source LLM to perform specification-to-grammar translation, and further applies Group Relative Policy Optimization (GRPO) to enhance grammar validity and generality. We also examine the effectiveness of iterative…
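To make the idea of storing and reusing counter values during derivation concrete, here is a minimal Python sketch. It does not use the paper's CCFG notation or any code from SAGE; the toy specification ("a line with N, then a line with N integers"), the bounds, and the names `generate_test_case` and `is_valid` are all assumptions chosen for illustration.

```python
import random


def generate_test_case(rng, n_max=5, value_max=100):
    """Derive one input for a toy spec: a line with N, then a line with N integers.

    Hypothetical spec (not from the paper): 1 <= N <= n_max, 1 <= a_i <= value_max.
    """
    counters = {}

    # Start rule: emit N and store the sampled value as a counter.
    counters["N"] = rng.randint(1, n_max)
    first_line = str(counters["N"])

    # List rule: the stored counter decides how many times the rule fires,
    # which a plain context-free grammar cannot express for unbounded N.
    remaining = counters["N"]
    values = []
    while remaining > 0:
        values.append(str(rng.randint(1, value_max)))
        remaining -= 1

    return first_line + "\n" + " ".join(values) + "\n"


def is_valid(text, n_max=5, value_max=100):
    """Check a candidate input against the same toy spec."""
    lines = text.splitlines()
    if len(lines) != 2:
        return False
    try:
        n = int(lines[0])
        values = [int(tok) for tok in lines[1].split(" ")]
    except ValueError:
        return False
    in_range = all(1 <= v <= value_max for v in values)
    return 1 <= n <= n_max and len(values) == n and in_range


if __name__ == "__main__":
    print(generate_test_case(random.Random(0)), end="")
```

The point of the sketch is the dependency between the two lines: the length of the second line is not free but is fixed by a value produced earlier in the derivation, which is the kind of logical constraint the abstract attributes to counters.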

Taxonomy

Topics: Software Testing and Debugging Techniques · Software System Performance and Reliability · Model-Driven Software Engineering Techniques