CoverUp: Effective High Coverage Test Generation for Python
Juan Altmayer Pizzorno, Emery D. Berger

TL;DR
CoverUp is a new method that uses coverage analysis, code context, and feedback prompts to guide large language models in generating high-coverage Python tests, significantly outperforming existing tools.
Contribution
CoverUp introduces a novel approach that combines feedback-driven prompts with LLMs to generate high-coverage Python regression tests, advancing automated test generation techniques.
Findings
CoverUp achieves a per-module median line+branch coverage of 80%, versus 47% for CodaMosa.
CoverUp reaches 89% overall line+branch coverage, surpassing MuTAP.
Performance benefits are due to the integrated components of CoverUp, not just the LLM.
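The two headline numbers use different aggregations: a per-module median and an overall figure. The sketch below illustrates the difference under the assumption that line+branch coverage is the number of covered lines plus covered branches divided by the total number of lines plus branches. The `ModuleCoverage` type, the function names, and the convention for empty modules are hypothetical and serve only as illustration.

```python
from dataclasses import dataclass
from statistics import median
from typing import Iterable, List


@dataclass
class ModuleCoverage:
    """Hypothetical per-module coverage counts (not a CoverUp data structure)."""
    covered_lines: int
    total_lines: int
    covered_branches: int
    total_branches: int


def line_branch_coverage(m: ModuleCoverage) -> float:
    """Assumed definition: (covered lines + branches) / (total lines + branches)."""
    total = m.total_lines + m.total_branches
    if total == 0:
        return 1.0  # assumption: a module with nothing to cover counts as covered
    return (m.covered_lines + m.covered_branches) / total


def median_per_module(modules: Iterable[ModuleCoverage]) -> float:
    """Median of the individual module ratios; every module weighs the same."""
    return median(line_branch_coverage(m) for m in modules)


def overall_coverage(modules: List[ModuleCoverage]) -> float:
    """Pooled ratio over all modules; large modules weigh more."""
    covered = sum(m.covered_lines + m.covered_branches for m in modules)
    total = sum(m.total_lines + m.total_branches for m in modules)
    return covered / total if total else 1.0
```

Because the median treats each module equally while the pooled ratio is dominated by large modules, the two figures can differ noticeably on the same benchmark.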
Abstract
Testing is an essential part of software development. Test generation tools attempt to automate the otherwise labor-intensive task of test creation, but generating high-coverage tests remains challenging. This paper proposes CoverUp, a novel approach to driving the generation of high-coverage Python regression tests. CoverUp combines coverage analysis, code context, and feedback in prompts that iteratively guide the LLM to generate tests that improve line and branch coverage. We evaluate our prototype CoverUp implementation across a benchmark of challenging code derived from open-source Python projects and show that CoverUp substantially improves on the state of the art. Compared to CodaMosa, a hybrid search/LLM-based test generator, CoverUp achieves a per-module median line+branch coverage of 80% (vs. 47%). Compared to MuTAP, a mutation- and LLM-based test generator, CoverUp achieves…
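The iterative loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not CoverUp's implementation: `build_prompt`, `generate_tests`, `ask_llm`, and `run_test` are hypothetical names, the prompt wording is invented, and only uncovered lines are tracked (branches are omitted for brevity). The caller supplies `ask_llm` (prompt in, test code out) and `run_test` (test code in, an error message or `None` plus the set of lines the test covered).

```python
from typing import Callable, List, Optional, Set, Tuple


def build_prompt(source: str, missing: Set[int], feedback: str = "") -> str:
    """Combine code context, uncovered lines, and feedback into one prompt."""
    lines = ", ".join(str(n) for n in sorted(missing))
    prompt = (
        "Write pytest tests for the code below.\n"
        f"Lines not yet covered: {lines}\n\n{source}\n"
    )
    if feedback:
        prompt += f"\nThe previous attempt had this problem:\n{feedback}\n"
    return prompt


def generate_tests(
    source: str,
    missing: Set[int],
    ask_llm: Callable[[str], str],  # hypothetical: prompt -> test code
    run_test: Callable[[str], Tuple[Optional[str], Set[int]]],  # hypothetical
    max_attempts: int = 3,
) -> Tuple[List[str], Set[int]]:
    """Ask for tests until the missing lines are covered or attempts run out."""
    accepted: List[str] = []
    remaining = set(missing)
    feedback = ""
    for _ in range(max_attempts):
        if not remaining:
            break
        test_code = ask_llm(build_prompt(source, remaining, feedback))
        error, covered = run_test(test_code)
        if error:
            # Failing test: pass the error back in the next prompt.
            feedback = error
        elif not (covered & remaining):
            # Passing test that adds no coverage: report that instead.
            feedback = "The test passed but did not cover any of the listed lines."
        else:
            # Useful test: keep it and shrink the set of uncovered lines.
            accepted.append(test_code)
            remaining -= covered
            feedback = ""
    return accepted, remaining
```

Passing the model and the test runner in as callables keeps the loop independent of any particular LLM API or coverage tool, and makes it easy to exercise with stubs.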
Taxonomy
Topics: Software Testing and Debugging Techniques · Real-time simulation and control systems
