The Rise of Agentic Testing: Multi-Agent Systems for Robust Software Quality Assurance

Saba Naqvi; Mohammad Baqar; Nawaz Ali Mohammad

arXiv:2601.02454·cs.SE·January 7, 2026

TL;DR

This paper presents a multi-agent testing framework that autonomously generates, executes, and refines software tests through a feedback loop, improving test validity and coverage while reducing human effort.

Contribution

It introduces a novel multi-agent, feedback-driven testing system that enhances automation and adaptability in software quality assurance processes.

Findings

01

Up to 60% reduction in invalid tests

02

30% increase in test coverage

03

Reduced human effort in testing processes

Abstract

Software testing has progressed toward intelligent automation, yet current AI-based test generators still suffer from static, single-shot outputs that frequently produce invalid, redundant, or non-executable tests because they lack execution-aware feedback. This paper introduces an agentic multi-model testing framework: a closed-loop, self-correcting system in which a Test Generation Agent, an Execution and Analysis Agent, and a Review and Optimization Agent collaboratively generate, execute, analyze, and refine tests until convergence. By using sandboxed execution, detailed failure reporting, and iterative regeneration or patching of failing tests, the framework autonomously improves test quality and expands coverage. Integrated into a CI/CD-compatible pipeline, it leverages reinforcement signals from coverage metrics and execution outcomes to guide refinement. Empirical evaluations on…
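The generate, execute, review loop described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the three agents are plain Python functions rather than model-backed agents, `CODE_UNDER_TEST`, `generate_tests`, `execute`, `review`, and `run_loop` are hypothetical names, the "sandbox" is only a separate interpreter process with a timeout, and the coverage-based reinforcement signal is not modeled.

```python
import subprocess
import sys
from dataclasses import dataclass

# Hypothetical code under test, prepended to each test so it runs standalone.
CODE_UNDER_TEST = "def add(a, b):\n    return a + b\n"


@dataclass
class Result:
    test: str
    passed: bool
    report: str  # captured stderr, standing in for a detailed failure report


def generate_tests():
    """Stand-in for the Test Generation Agent (a real one would query a model)."""
    return [
        "assert add(1, 2) == 3",   # valid
        "assert add(2, 2) == 5",   # wrong expectation
        "assert addd(1, 1) == 2",  # non-executable: undefined name
    ]


def execute(test, timeout=10):
    """Stand-in for the Execution and Analysis Agent: run in a child process."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", CODE_UNDER_TEST + test],
            capture_output=True, text=True, timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return Result(test, False, "timeout")
    return Result(test, proc.returncode == 0, proc.stderr)


def review(result):
    """Stand-in for the Review and Optimization Agent: patch a test or drop it."""
    if "NameError" in result.report:
        # Toy patch rule (assumption): fix the one known misspelling.
        return result.test.replace("addd", "add")
    return None  # any other failure: discard the test


def run_loop(max_rounds=3):
    """Iterate until no failing tests remain or the round budget is spent."""
    pending, accepted = generate_tests(), []
    for _ in range(max_rounds):
        if not pending:
            break
        retry = []
        for test in pending:
            result = execute(test)
            if result.passed:
                accepted.append(test)
                continue
            patched = review(result)
            if patched is not None and patched != test:
                retry.append(patched)  # re-execute the patched test next round
        pending = retry
    return accepted


if __name__ == "__main__":
    for test in run_loop():
        print("kept:", test)
```

In this toy run the misspelled test is patched and accepted on the second round, while the test with the wrong expectation is discarded. A fuller version would feed coverage data back into generation, which this sketch leaves out.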

Taxonomy

Topics: Software Testing and Debugging Techniques · Software System Performance and Reliability · Software Reliability and Analysis Research