Agentic LLMs for REST API Test Amplification: A Comparative Study Across Cloud Applications
Jarne Besjes, Robbe Nooyens, Tolgahan Bardakci, Mutlu Beyazit, Serge Demeyer

TL;DR
This paper evaluates how effectively agentic Large Language Models (LLMs) amplify REST API test suites across multiple cloud applications. It reports improved coverage and defect detection, and weighs these gains against computational cost.
Contribution
It introduces and compares single-agent and multi-agent LLM configurations for REST API test amplification, showing that they generalize across cloud applications and improve test coverage.
Findings
Increased endpoint and parameter coverage
Effective defect detection across heterogeneous APIs
Trade-offs between accuracy, scalability, and efficiency
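To make the coverage findings above concrete, here is a minimal sketch of how endpoint and parameter coverage could be measured before and after amplification. Everything in it is an assumption for illustration: the `SPEC` dictionary, the `coverage` helper, the example calls, and the coverage definitions themselves are hypothetical and are not taken from the paper, which may define its metrics differently.

```python
# Hypothetical API description: (method, path template) -> declared parameter names.
SPEC = {
    ("GET", "/pets"): {"limit", "status"},
    ("POST", "/pets"): {"name", "tag"},
    ("GET", "/pets/{id}"): {"id"},
    ("DELETE", "/pets/{id}"): {"id"},
}


def coverage(spec, calls):
    """Return (endpoint_coverage, parameter_coverage) as fractions in [0, 1].

    `calls` is an iterable of (method, path_template, params_used) tuples,
    one per request issued by the test suite.
    """
    hit_endpoints = set()
    hit_params = set()
    for method, path, params in calls:
        key = (method, path)
        if key not in spec:
            continue  # ignore requests to endpoints the spec does not declare
        hit_endpoints.add(key)
        # Count a parameter only if the spec declares it for this endpoint.
        hit_params.update((key, p) for p in params if p in spec[key])
    total_params = sum(len(names) for names in spec.values())
    endpoint_cov = len(hit_endpoints) / len(spec) if spec else 0.0
    param_cov = len(hit_params) / total_params if total_params else 0.0
    return endpoint_cov, param_cov


# Hypothetical original suite, and the same suite with two amplified tests added.
ORIGINAL = [("GET", "/pets", {"limit"}), ("POST", "/pets", {"name"})]
AMPLIFIED = ORIGINAL + [
    ("GET", "/pets", {"status"}),
    ("GET", "/pets/{id}", {"id"}),
]

if __name__ == "__main__":
    print("original :", coverage(SPEC, ORIGINAL))
    print("amplified:", coverage(SPEC, AMPLIFIED))
```

In this toy setup the amplified suite reaches three of the four declared endpoints instead of two, and four of the six declared parameters instead of two. A real measurement would derive the spec from the application's OpenAPI document and the calls from recorded test traffic.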
Abstract
Representational State Transfer (REST) APIs are a cornerstone of modern cloud-native systems. Ensuring their reliability demands automated test suites that exercise diverse and boundary-level behaviors. Nevertheless, designing such test cases remains a challenging and resource-intensive endeavor. This study extends prior work on Large Language Model (LLM)-based test amplification by evaluating single-agent and multi-agent configurations across four additional cloud applications. The amplified test suites maintain semantic validity with minimal human intervention. The results demonstrate that agentic LLM systems can effectively generalize across heterogeneous API architectures, increasing endpoint and parameter coverage while revealing defects. Moreover, a detailed analysis of computational cost, runtime, and energy consumption highlights trade-offs between accuracy, scalability, and…
Taxonomy
Topics: Software System Performance and Reliability · Software Testing and Debugging Techniques · Big Data and Digital Economy
