Planning to Explore: Curiosity-Driven Planning for LLM Test Generation
Alfonso Amayuelas, Firas Laakom, Piotr Piękos, Wenyi Wang, Yifan Xu, Yuhui Wang, Jürgen Schmidhuber, William Wang

TL;DR
This paper introduces CovQValue, a curiosity-driven planning method for LLM-based code test generation that outperforms greedy strategies by balancing immediate coverage with future reachability, leading to higher branch coverage.
Contribution
The paper frames LLM test generation as a Bayesian exploration problem: the coverage map serves as a proxy posterior over the program's branch structure, and LLM-estimated Q-values are used to select among candidate plans.
Findings
CovQValue achieves 51-77% higher branch coverage than greedy methods.
The approach outperforms existing strategies on TestGenEval Lite across three LLMs.
The method demonstrates effective exploration in the new RepoExploreBench benchmark.
Abstract
The use of LLMs for code generation has naturally extended to code testing and evaluation. As codebases grow in size and complexity, so does the need for automated test generation. Current approaches for LLM-based test generation rely on strategies that maximize immediate coverage gain, a greedy approach that plateaus on code where reaching deep branches requires setup steps that individually yield zero new coverage. Drawing on principles of Bayesian exploration, we treat the program's branch structure as an unknown environment, and an evolving coverage map as a proxy probabilistic posterior representing what the LLM has discovered so far. Our method, CovQValue, feeds the coverage map back to the LLM, generates diverse candidate plans in parallel, and selects the most informative plan by LLM-estimated Q-values, seeking actions that balance immediate branch discovery with future…
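The contrast between greedy selection and Q-value selection described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the plan names, the toy branch tables, the additive form `immediate + gamma * future`, and the helper functions are hypothetical and are not taken from the paper, where the Q-values are estimated by the LLM rather than looked up.

```python
from typing import Callable, FrozenSet, Iterable

Plan = str
Covered = FrozenSet[str]  # branch ids already hit (the "coverage map")

def select_greedy(candidates: Iterable[Plan], covered: Covered,
                  immediate_gain: Callable[[Plan, Covered], float]) -> Plan:
    """Greedy baseline: pick the plan with the largest immediate coverage gain."""
    return max(candidates, key=lambda p: immediate_gain(p, covered))

def select_by_q(candidates: Iterable[Plan], covered: Covered,
                immediate_gain: Callable[[Plan, Covered], float],
                future_value: Callable[[Plan, Covered], float],
                gamma: float = 0.9) -> Plan:
    """Pick the plan with the largest estimated Q-value.

    Assumed form: Q = immediate gain + gamma * estimated future reachability.
    In the paper the estimate comes from the LLM; here it is a caller-supplied stub.
    """
    def q(plan: Plan) -> float:
        return immediate_gain(plan, covered) + gamma * future_value(plan, covered)
    return max(candidates, key=q)

# Hypothetical toy program: "setup_db" covers no new branch by itself,
# but it is a prerequisite for three deeper branches.
TOY_BRANCHES = {"call_helper": {"b1"}, "setup_db": set()}
TOY_UNLOCKS = {"call_helper": 0, "setup_db": 3}

def toy_immediate_gain(plan: Plan, covered: Covered) -> float:
    # Number of branches this plan would newly cover.
    return len(TOY_BRANCHES[plan] - covered)

def toy_future_value(plan: Plan, covered: Covered) -> float:
    # Stand-in for an LLM estimate of branches made reachable afterwards.
    return TOY_UNLOCKS[plan]

candidates = ["call_helper", "setup_db"]
covered: Covered = frozenset()

greedy_choice = select_greedy(candidates, covered, toy_immediate_gain)
q_choice = select_by_q(candidates, covered, toy_immediate_gain, toy_future_value)
print(greedy_choice, q_choice)
```

In this toy setting the greedy rule prefers the plan that covers one branch right away, while the Q-value rule prefers the zero-gain setup step because of its estimated future reachability, which is the plateau behavior the abstract attributes to greedy strategies.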
