Agentic LLMs for REST API Test Amplification: A Comparative Study Across Cloud Applications

Jarne Besjes; Robbe Nooyens; Tolgahan Bardakci; Mutlu Beyazit; Serge Demeyer

arXiv:2510.27417·cs.SE·November 3, 2025

Open Access

TL;DR

This paper evaluates how effectively agentic Large Language Models amplify REST API test suites across multiple cloud applications, reporting improved coverage and defect detection and weighing these gains against computational cost.

Contribution

It introduces and compares single-agent and multi-agent LLM configurations for REST API test amplification, showing that they generalize across cloud applications and improve test coverage.

Findings

01

Increased endpoint and parameter coverage

02

Effective defect detection across heterogeneous APIs

03

Trade-offs between accuracy, scalability, and efficiency
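
To make the first finding concrete, the following is a minimal sketch of how endpoint and parameter coverage could be computed for a test suite before and after amplification. It is an illustration under stated assumptions, not the authors' measurement tooling: the `Operation` model, the `coverage` function, and the sample pet-store routes are all hypothetical, and the paper may define these metrics differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operation:
    """One documented API operation (hypothetical model, not from the paper)."""
    method: str
    path: str
    params: frozenset


def coverage(spec, calls):
    """Return (endpoint_coverage, parameter_coverage) as fractions in [0, 1].

    spec  -- iterable of Operation objects documented by the API
    calls -- iterable of (method, path, params_used) tuples exercised by tests
    """
    documented = {(op.method, op.path): op.params for op in spec}
    hit_endpoints = set()
    hit_params = set()
    for method, path, used in calls:
        key = (method, path)
        if key not in documented:
            continue  # ignore calls to operations the spec does not document
        hit_endpoints.add(key)
        # Count only parameters that the spec documents for this operation.
        hit_params.update((key, p) for p in used if p in documented[key])
    total_params = sum(len(p) for p in documented.values())
    endpoint_cov = len(hit_endpoints) / len(documented) if documented else 0.0
    param_cov = len(hit_params) / total_params if total_params else 0.0
    return endpoint_cov, param_cov


# Illustrative spec and test suites (assumed data, not results from the study).
spec = [
    Operation("GET", "/pets", frozenset({"limit", "tag"})),
    Operation("POST", "/pets", frozenset({"name"})),
    Operation("GET", "/pets/{id}", frozenset({"id"})),
]
original = [("GET", "/pets", {"limit"})]
amplified = original + [("POST", "/pets", {"name"}), ("GET", "/pets", {"tag"})]

print(coverage(spec, original))
print(coverage(spec, amplified))
```

In this toy setup, amplification adds one new operation and two new parameters to the set exercised by the suite, so both fractions rise. A real measurement would derive `spec` from the API's OpenAPI description and `calls` from recorded test traffic.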

Abstract

Representational State Transfer (REST) APIs are a cornerstone of modern cloud-native systems. Ensuring their reliability demands automated test suites that exercise diverse and boundary-level behaviors. Nevertheless, designing such test cases remains a challenging and resource-intensive endeavor. This study extends prior work on Large Language Model (LLM)-based test amplification by evaluating single-agent and multi-agent configurations across four additional cloud applications. The amplified test suites maintain semantic validity with minimal human intervention. The results demonstrate that agentic LLM systems can effectively generalize across heterogeneous API architectures, increasing endpoint and parameter coverage while revealing defects. Moreover, a detailed analysis of computational cost, runtime, and energy consumption highlights trade-offs between accuracy, scalability, and…

Taxonomy

Topics: Software System Performance and Reliability · Software Testing and Debugging Techniques · Big Data and Digital Economy