# LLM Agents for Generating Microservice-based Applications: how complex is your specification?

**Authors:** Daniel M. Yellin

arXiv: 2508.20119 · 2025-10-28

## TL;DR

This paper evaluates how well LLM Agents generate code for microservice-based applications. It introduces a metric for scoring the difficulty of a specification and analyzes the errors agents make as specifications become more complex.

## Contribution

The paper proposes a standard template for specifying microservice-based applications and a metric for scoring the difficulty of a specification, and it shows how the quality of generated code varies with that difficulty score.
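To make the idea of a difficulty score concrete, here is a minimal sketch in Python. It is not the paper's metric: the `ServiceSpec` fields, the `WEIGHTS` values, and the `difficulty` function are all hypothetical, chosen only to reflect the factors the abstract names as sources of difficulty (business logic, external services, database integration, and non-functional capabilities such as authentication).

```python
from dataclasses import dataclass


@dataclass
class ServiceSpec:
    """Hypothetical, simplified description of one microservice in a specification."""
    name: str
    endpoints: int           # number of API endpoints the service exposes
    logic_rules: int         # number of distinct business-logic rules
    external_services: int   # number of other services it calls
    uses_database: bool      # whether it needs database integration
    requires_auth: bool      # whether it needs authentication


# Illustrative weights only; these values are assumptions, not taken from the paper.
WEIGHTS = {
    "endpoints": 1,
    "logic_rules": 2,
    "external_services": 3,
    "uses_database": 4,
    "requires_auth": 5,
}


def difficulty(spec: list[ServiceSpec]) -> int:
    """Sum a weighted count of complexity factors over all services.

    A higher score means a harder specification, matching the direction
    of the paper's metric, but the formula itself is a toy.
    """
    total = 0
    for service in spec:
        total += WEIGHTS["endpoints"] * service.endpoints
        total += WEIGHTS["logic_rules"] * service.logic_rules
        total += WEIGHTS["external_services"] * service.external_services
        total += WEIGHTS["uses_database"] * int(service.uses_database)
        total += WEIGHTS["requires_auth"] * int(service.requires_auth)
    return total


# Two made-up specifications: one simple, one with more of every factor.
simple_spec = [ServiceSpec("catalog", 2, 1, 0, False, False)]
complex_spec = [ServiceSpec("orders", 4, 5, 2, True, True)]
```

Under these assumed weights, `difficulty(complex_spec)` exceeds `difficulty(simple_spec)`, which is the only property the sketch is meant to show. The actual template and scoring rules are defined in the full paper.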

## Key findings

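- Agents using strong LLMs do fairly well on medium-difficulty specifications.
- They do poorly on higher-difficulty specifications, which involve more intricate business logic, greater use of external services, database integration, and non-functional capabilities such as authentication.
- A fine-grained approach to code generation improves the correctness of the generated code.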

## Abstract

In this paper we evaluate the capabilities of LLM Agents in generating code for real-world problems. Specifically, we explore code synthesis for microservice-based applications, a widely used architectural pattern for building applications. We define a standard template for specifying these applications, and we propose a metric for scoring the difficulty of a specification. The higher the score, the more difficult it is to generate code for the specification. Our experimental results show that agents using strong LLMs (like GPT-3o-mini) do fairly well on medium difficulty specifications but do poorly on those of higher difficulty levels. This is due to more intricate business logic, a greater use of external services, database integration and inclusion of non-functional capabilities such as authentication. We analyzed the errors in LLM-synthesized code and report on the key challenges LLM Agents face in generating code for these specifications. Finally, we show that using a fine-grained approach to code generation improves the correctness of the generated code.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2508.20119/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/2508.20119/full.md

## References

38 references — full list in the complete paper: https://tomesphere.com/paper/2508.20119/full.md

---
Source: https://tomesphere.com/paper/2508.20119