AgentDevel: Reframing Self-Evolving LLM Agents as Release Engineering

Di Zhang

arXiv:2601.04620·cs.AI·January 9, 2026

TL;DR

AgentDevel applies release engineering to LLM agents: improvement is externalized into a regression-aware pipeline that targets stable, auditable, non-regressive updates, in contrast to approaches that embed self-improvement inside the agent or search over many concurrent variants.

Contribution

The paper presents AgentDevel, a release engineering pipeline for LLM agents that emphasizes non-regression, external diagnostics, and stable iterative improvement.

Findings

01 Yields stable improvements with fewer regressions.

02 Produces reproducible, auditable artifacts.

03 Maintains a single canonical version line.

Abstract

Recent progress in large language model (LLM) agents has largely focused on embedding self-improvement mechanisms inside the agent or searching over many concurrent variants. While these approaches can raise aggregate scores, they often yield unstable and hard-to-audit improvement trajectories, making it difficult to guarantee non-regression or to reason about failures across versions. We reframe agent improvement as release engineering: agents are treated as shippable artifacts, and improvement is externalized into a regression-aware release pipeline. We introduce AgentDevel, a release engineering pipeline that iteratively runs the current agent, produces implementation-blind, symptom-level quality signals from execution traces, synthesizes a single release candidate (RC) via executable diagnosis, and promotes it under flip-centered gating. AgentDevel features three…
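
The abstract does not spell out how flip-centered gating decides a promotion. As a minimal sketch, assuming "flips" means per-case pass/fail changes between the current agent and the release candidate, one possible gate might look like the following. The names `FlipReport`, `count_flips`, and `should_promote`, and the `max_regressions` threshold, are hypothetical and not taken from the paper.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class FlipReport:
    """Per-case outcome changes between two agent versions (illustrative)."""
    fixed: list[str]      # cases that went fail -> pass
    regressed: list[str]  # cases that went pass -> fail


def count_flips(baseline: dict[str, bool], candidate: dict[str, bool]) -> FlipReport:
    """Compare pass/fail results on the cases both versions were run on."""
    shared = sorted(baseline.keys() & candidate.keys())
    fixed = [case for case in shared if not baseline[case] and candidate[case]]
    regressed = [case for case in shared if baseline[case] and not candidate[case]]
    return FlipReport(fixed=fixed, regressed=regressed)


def should_promote(report: FlipReport, max_regressions: int = 0) -> bool:
    """Assumed rule: promote only if regressions stay within budget
    and the candidate fixes more cases than it breaks."""
    within_budget = len(report.regressed) <= max_regressions
    net_positive = len(report.fixed) > len(report.regressed)
    return within_budget and net_positive


# Usage: the candidate fixes t2 and breaks nothing, so it is promoted.
current = {"t1": True, "t2": False, "t3": True}
release_candidate = {"t1": True, "t2": True, "t3": True}
report = count_flips(current, release_candidate)
print(report.fixed, report.regressed, should_promote(report))
```

Gating on individual flips rather than on the aggregate score is what makes this kind of check regression-aware: a candidate that fixes one case while breaking another can leave the total unchanged, yet it would still be rejected under a zero-regression budget.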

Taxonomy

Topics: Topic Modeling · Software Engineering Research · Software Engineering Techniques and Practices