PPO guided Agentic Pipeline for Adaptive Prompt Selection and Test Case Generation
Gourisetty Venkata Sai Koushik, Dama Aditya, Mahankali Harish Sai, Peddi Siddarhta, Shadab Ahmad, Vivek Yelleti

TL;DR
This paper introduces a reinforcement learning-based framework combining PPO and LLMs to optimize prompt selection for automated test case generation, significantly improving code coverage on benchmark programs.
Contribution
The paper integrates PPO with LLMs for adaptive prompt selection in test generation and reports higher coverage than existing static prompting methods.
Findings
PPO-LLM achieves higher branch and line coverage than CBMC, kS-LLM, and kS-LLM++.
Adaptive prompt selection leads to near 100% branch coverage on some benchmarks.
The approach effectively explores unvisited program paths using reinforcement learning guidance.
Abstract
Developing effective test cases capable of thoroughly exercising large-scale software systems is inherently difficult, especially when such systems have voluminous, complex, and deeply nested source code. In this work, we present a novel approach for generating test cases using a reinforcement learning-driven agentic framework in which Proximal Policy Optimization (PPO) is coupled with an LLM engine to guide prompt selection during test generation. Our approach consists of two phases. In Phase I, the ToT-guided optimization agent partitions and minimizes the source code by removing redundancies without changing its functional behavior. In Phase II, a PPO-based policy network is trained to select among eight different prompting techniques, such as Boundary Value Analysis, Random Fuzzing, etc., based on the input 11-dimensional state…
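The Phase II idea of a policy choosing one of eight prompting techniques from an 11-dimensional state can be illustrated with a minimal sketch. This is not the authors' implementation: the linear softmax policy, the random state features, and all function names are assumptions made for illustration. Only two technique names appear in the abstract, so the remaining six are left as placeholders. The clipped objective shown is the standard PPO surrogate for a single sample.

```python
import math
import random

STATE_DIM = 11   # from the abstract: 11-dimensional state
NUM_PROMPTS = 8  # from the abstract: eight prompting techniques

# Only the first two names are given in the abstract; the rest are placeholders.
PROMPTS = ["Boundary Value Analysis", "Random Fuzzing"] + [
    f"technique_{i}" for i in range(3, NUM_PROMPTS + 1)
]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def policy_probs(weights, state):
    """Hypothetical linear policy: one weight row per prompt, softmax over scores."""
    logits = [sum(w * s for w, s in zip(row, state)) for row in weights]
    return softmax(logits)


def select_prompt(weights, state, rng):
    """Sample a prompt index from the policy; return it with its probability."""
    probs = policy_probs(weights, state)
    action = rng.choices(range(NUM_PROMPTS), weights=probs, k=1)[0]
    return action, probs[action]


def ppo_clip_objective(ratio, advantage, eps=0.2):
    """Standard PPO clipped surrogate: min(r * A, clip(r, 1 - eps, 1 + eps) * A)."""
    clipped = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped * advantage)


rng = random.Random(0)
weights = [[rng.uniform(-0.1, 0.1) for _ in range(STATE_DIM)]
           for _ in range(NUM_PROMPTS)]
state = [rng.random() for _ in range(STATE_DIM)]  # hypothetical state features
action, prob = select_prompt(weights, state, rng)
print(PROMPTS[action], round(prob, 3))
```

In a full training loop, the ratio would be the new policy's probability of the chosen prompt divided by the old policy's probability, and the advantage would come from a reward signal such as coverage gain. Both of those choices are assumptions here, since the truncated abstract does not specify them.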
