SEW: Self-Evolving Agentic Workflows for Automated Code Generation

Siwei Liu; Jinyuan Fang; Han Zhou; Yingxu Wang; Zaiqiao Meng

arXiv:2505.18646 · cs.SE · April 15, 2026

TL;DR

SEW is a novel framework that automatically creates and optimizes multi-agent workflows for code generation, improving performance on coding benchmarks without manual design.

Contribution

It introduces a self-evolving system that automates workflow design and optimization, reducing reliance on manual configuration in multi-agent code generation.

Findings

01. SEW achieves up to a 12% improvement on LiveCodeBench.

02. It can automatically design effective agentic workflows.

03. The paper provides insights into which workflow encoding schemes work best.

Abstract

Large Language Models (LLMs) have demonstrated effectiveness in code generation tasks. To enable LLMs to address more complex coding challenges, existing research has focused on crafting multi-agent systems with agentic workflows, in which complex coding tasks are decomposed into sub-tasks and assigned to specialized agents. Despite their effectiveness, current approaches rely heavily on hand-crafted agentic workflows, with both agent topologies and prompts manually designed, which limits their ability to adapt automatically to different types of coding problems. To address these limitations and enable automated workflow design, we propose Self-Evolving Workflow (SEW), a novel self-evolving framework that automatically generates and optimises multi-agent workflows. Extensive experiments on three coding benchmark datasets, including the challenging…
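To make the terms in the abstract concrete, the sketch below models an agentic workflow as an ordered list of agents (the topology), each carrying its own prompt, and pairs it with a simple greedy search that keeps a modified workflow only when it scores higher. This is an illustrative toy under stated assumptions, not SEW's actual algorithm: the names `Agent`, `run_workflow`, `evolve`, `mutate`, and `score` are hypothetical, the LLM is a plain callable supplied by the caller, and the linear topology is a simplification.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical types for illustration; none of these names come from the paper.
@dataclass(frozen=True)
class Agent:
    role: str    # e.g. "planner", "coder", "reviewer"
    prompt: str  # instruction text passed to the underlying LLM

# Stand-in for an LLM call: takes (prompt, input text) and returns text.
LLM = Callable[[str, str], str]
Workflow = List[Agent]

def run_workflow(agents: Workflow, task: str, llm: LLM) -> str:
    """Run a linear workflow: each agent's output feeds the next agent."""
    message = task
    for agent in agents:
        message = llm(agent.prompt, message)
    return message

def evolve(
    agents: Workflow,
    mutate: Callable[[Workflow], Workflow],
    score: Callable[[Workflow], float],
    rounds: int,
) -> Workflow:
    """Greedy search: keep a mutated workflow only if it scores strictly higher."""
    best, best_score = agents, score(agents)
    for _ in range(rounds):
        candidate = mutate(best)
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best
```

In practice `score` would run the workflow on a set of coding problems and report a pass rate, and `mutate` would rewrite a prompt or change the agent topology; both are left as caller-supplied functions here because the summary above does not specify them.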
