PPO guided Agentic Pipeline for Adaptive Prompt Selection and Test Case Generation

Gourisetty Venkata Sai Koushik; Dama Aditya; Mahankali Harish Sai; Peddi Siddarhta; Shadab Ahmad; Vivek Yelleti

arXiv:2605.00942·cs.SE·May 5, 2026

TL;DR

This paper introduces a reinforcement learning-based framework combining PPO and LLMs to optimize prompt selection for automated test case generation, significantly improving code coverage on benchmark programs.

Contribution

The paper integrates PPO with LLMs to select prompts adaptively during test generation, and reports better coverage metrics than existing static prompting methods.

Findings

01 PPO-LLM achieves higher branch and line coverage than CBMC, kS-LLM, and kS-LLM++.

02 Adaptive prompt selection reaches nearly 100% branch coverage on some benchmarks.

03 Reinforcement learning guidance steers the approach toward program paths that have not yet been visited.

Abstract

Developing test cases that thoroughly exercise large-scale software systems is inherently difficult, especially when such systems have voluminous, complex, and deeply nested source code. In this work, we present a novel approach for generating test cases using a reinforcement learning-driven agentic framework in which Proximal Policy Optimization (PPO) is coupled with an LLM engine to guide prompt selection during test generation. Our approach consists of two phases. In Phase I, the ToT-guided optimization agent partitions and minimizes the source code by removing redundancies without changing its functional behavior. In Phase II, a PPO-based policy network is trained to select among eight different prompting techniques, such as Boundary Value Analysis and Random Fuzzing, based on the input 11-dimensional state…
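The Phase II idea of a policy choosing one of eight prompting techniques from an 11-dimensional state can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the linear softmax policy, the placeholder technique names beyond the two the abstract mentions, the contents of the state vector, and the clipping parameter `eps=0.2` are all hypothetical. The only standard piece is the PPO clipped surrogate term, min(r·A, clip(r, 1−ε, 1+ε)·A).

```python
import math
import random

N_PROMPTS = 8    # number of prompting techniques (from the abstract)
STATE_DIM = 11   # state dimensionality (from the abstract)

# Only the first two names appear in the abstract; the rest are placeholders.
PROMPT_NAMES = ["boundary_value_analysis", "random_fuzzing"] + [
    f"technique_{i}" for i in range(3, N_PROMPTS + 1)
]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def policy_probs(weights, state):
    """Hypothetical linear policy: one weight row per prompting technique."""
    logits = [sum(w * s for w, s in zip(row, state)) for row in weights]
    return softmax(logits)


def select_prompt(weights, state, rng):
    """Sample a technique index and return it with its probability."""
    probs = policy_probs(weights, state)
    action = rng.choices(range(N_PROMPTS), weights=probs, k=1)[0]
    return action, probs[action]


def ppo_clipped_objective(new_prob, old_prob, advantage, eps=0.2):
    """PPO clipped surrogate for one sample: min(r*A, clip(r, 1-eps, 1+eps)*A)."""
    ratio = new_prob / old_prob
    clipped = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped * advantage)


if __name__ == "__main__":
    rng = random.Random(0)
    # Zero weights give a uniform policy; the state values are made up.
    weights = [[0.0] * STATE_DIM for _ in range(N_PROMPTS)]
    state = [0.5] * STATE_DIM
    action, prob = select_prompt(weights, state, rng)
    print(PROMPT_NAMES[action], prob)
```

In a full training loop, the advantage would presumably come from a coverage-based reward, and the objective would be averaged over a batch and maximized by gradient ascent; the abstract as shown here does not specify those details.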
