Do Autonomous Agents Contribute Test Code? A Study of Tests in Agentic Pull Requests
Sabrina Haque, Sarvesh Ingale, Christoph Csallner

TL;DR
This study investigates how autonomous coding agents include tests in their pull requests, revealing trends, differences across agents, and implications for software quality and development workflows.
Contribution
It provides the first empirical analysis of test inclusion in agentic pull requests, highlighting patterns and variations across different autonomous agents.
Findings
Test-containing PRs become more common over time.
Test-containing PRs tend to be larger and take longer to complete.
Merge rates are similar regardless of test inclusion.
Abstract
Testing is a critical practice for ensuring software correctness and long-term maintainability. As agentic coding tools increasingly submit pull requests (PRs), it becomes essential to understand how testing appears in these agent-driven workflows. Using the AIDev dataset, we present an empirical study of test inclusion in agentic pull requests. We examine how often tests are included, when they are introduced during the PR lifecycle, and how test-containing PRs differ from non-test PRs in terms of size, turnaround time, and merge outcomes. Across agents, test-containing PRs are more common over time and tend to be larger and take longer to complete, while merge rates remain largely similar. We also observe variation across agents in both test adoption and the balance between test and production code within test PRs. Our findings provide a descriptive view of testing behavior in agentic…
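The comparison described in the abstract (test-containing versus non-test PRs, by size and merge outcome) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `PullRequest` record, its field names, and the path-based `TEST_PATH` heuristic are hypothetical and are not taken from the paper or from the AIDev dataset schema.

```python
import re
from dataclasses import dataclass
from statistics import median

# Hypothetical heuristic (not the paper's method): a file counts as a test
# file if it lives in a test/spec directory or follows a common test naming
# pattern. It is deliberately incomplete (e.g. it misses FooTest.java).
TEST_PATH = re.compile(
    r"(^|/)(tests?|__tests__|spec)/"   # tests/, test/, __tests__/, spec/
    r"|(^|/)test_[^/]+$"               # test_foo.py
    r"|[._-](test|spec)\.[^/.]+$",     # foo.test.ts, foo_test.go, foo-spec.js
    re.IGNORECASE,
)


@dataclass
class PullRequest:
    """Hypothetical PR record; field names are illustrative only."""
    files: list[str]       # paths of files changed by the PR
    lines_changed: int     # additions plus deletions
    merged: bool


def is_test_file(path: str) -> bool:
    return TEST_PATH.search(path) is not None


def has_tests(pr: PullRequest) -> bool:
    return any(is_test_file(path) for path in pr.files)


def summarize(prs: list[PullRequest]) -> dict[str, dict[str, float]]:
    """Group PRs by test inclusion and report count, merge rate, median size."""
    groups: dict[bool, list[PullRequest]] = {True: [], False: []}
    for pr in prs:
        groups[has_tests(pr)].append(pr)

    summary: dict[str, dict[str, float]] = {}
    for flag, group in groups.items():
        if not group:
            continue  # skip empty groups to avoid dividing by zero
        summary["test" if flag else "non_test"] = {
            "count": len(group),
            "merge_rate": sum(pr.merged for pr in group) / len(group),
            "median_lines_changed": median(pr.lines_changed for pr in group),
        }
    return summary
```

Calling `summarize` on a list of such records returns one entry per group, so the two merge rates and median sizes can be compared side by side. A real analysis would need a more careful test-file classifier and a turnaround-time field, both omitted here.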
Taxonomy
Topics: Software Testing and Debugging Techniques · Software Engineering Techniques and Practices · Software Engineering Research
