Automated Unit Test Improvement using Large Language Models at Meta
Nadia Alshahwan, Jubin Chheda, Anastasia Finegenova, Beliz Gokkaya, Mark Harman, Inna Harper, Alexandru Marginean, Shubho Sengupta, Eddy Wang

TL;DR
Meta's TestGen-LLM leverages large language models to automatically improve human-written tests, successfully increasing coverage and reliability in large-scale industrial deployments at Meta.
Contribution
This work introduces TestGen-LLM, a novel LLM-based tool for automatic test improvement with built-in verification, deployed at Meta's scale for the first time.
Findings
75% of generated test cases built correctly
57% of generated test cases passed reliably
25% of generated test cases increased coverage
Abstract
This paper describes Meta's TestGen-LLM tool, which uses LLMs to automatically improve existing human-written tests. TestGen-LLM verifies that its generated test classes successfully clear a set of filters that assure measurable improvement over the original test suite, thereby eliminating problems due to LLM hallucination. We describe the deployment of TestGen-LLM at Meta test-a-thons for the Instagram and Facebook platforms. In an evaluation on Reels and Stories products for Instagram, 75% of TestGen-LLM's test cases built correctly, 57% passed reliably, and 25% increased coverage. During Meta's Instagram and Facebook test-a-thons, it improved 11.5% of all classes to which it was applied, with 73% of its recommendations being accepted for production deployment by Meta software engineers. We believe this is the first report on industrial scale deployment of LLM-generated code backed by…
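The filtering idea in the abstract (keep a generated test only if it builds, passes reliably, and adds coverage) can be illustrated with a minimal sketch. This is not Meta's implementation: the `Candidate` type, the `builds`, `passes`, and `covered_lines` callables, and the `repeats` parameter are all hypothetical names introduced here, and "passes reliably" is assumed to mean passing on several repeated runs.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Set


@dataclass(frozen=True)
class Candidate:
    """A generated test case (hypothetical representation)."""
    name: str
    source: str


def filter_candidates(
    candidates: Iterable[Candidate],
    builds: Callable[[Candidate], bool],
    passes: Callable[[Candidate], bool],
    covered_lines: Callable[[Candidate], Set[str]],
    baseline: Set[str],
    repeats: int = 5,  # assumption: reliability is checked by repeated runs
) -> List[Candidate]:
    """Keep only candidates that clear all three filters in order."""
    accepted = []
    for candidate in candidates:
        # Filter 1: the test must build.
        if not builds(candidate):
            continue
        # Filter 2: the test must pass on every repeated run (rejects flaky tests).
        if not all(passes(candidate) for _ in range(repeats)):
            continue
        # Filter 3: the test must cover something the original suite did not.
        if not (covered_lines(candidate) - baseline):
            continue
        accepted.append(candidate)
    return accepted
```

Because each filter is an executable check rather than a judgment about the generated text, a candidate that survives has been shown to build, pass, and add coverage, whatever the model produced. For simplicity the sketch compares every candidate against the original baseline only, so two accepted candidates may cover the same new lines.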
Taxonomy
Topics: Software Testing and Debugging Techniques · Topic Modeling · Natural Language Processing Techniques
