FlashEvolve: Accelerating Agent Self-Evolution with Asynchronous Stage Orchestration
Zhengding Hu, Mingge Lu, Zhen Wang, Jixuan Ruan, Chang Chen, Zaifeng Pan, Yue Guan, Ruiyi Wang, Zhongkai Yu, Chao Zhang, Yufei Ding

TL;DR
FlashEvolve is an asynchronous framework for LLM-based agent evolution. It cuts wall-clock time by overlapping stages, and it manages the data staleness that asynchrony introduces through artifact version tracking and repair policies.
Contribution
FlashEvolve replaces synchronized stage execution with asynchronous workers and queues, so that different stages and steps overlap and throughput in agent evolution workflows improves.
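A minimal sketch of asynchronous workers connected by queues, assuming a toy two-stage pipeline. The stage names `propose` and `evaluate`, the helpers `stage_worker` and `run_pipeline`, and the `asyncio.sleep` stand-in for an LLM call are all hypothetical and are not taken from the paper.

```python
import asyncio

STOP = None  # sentinel that tells a worker to shut down


async def stage_worker(name, delay, inbox, outbox):
    """Hypothetical stage: consume items, simulate an LLM call, emit results."""
    while True:
        item = await inbox.get()
        if item is STOP:
            await outbox.put(STOP)  # pass the shutdown signal downstream
            return
        await asyncio.sleep(delay)  # stand-in for an LLM-heavy step
        await outbox.put(f"{item}->{name}")


async def run_pipeline(n_items, delay=0.01):
    """Run two stages concurrently; each stage starts as soon as input arrives."""
    to_propose, to_evaluate, finished = (asyncio.Queue() for _ in range(3))
    workers = [
        asyncio.create_task(stage_worker("propose", delay, to_propose, to_evaluate)),
        asyncio.create_task(stage_worker("evaluate", delay, to_evaluate, finished)),
    ]
    for i in range(n_items):
        await to_propose.put(f"item{i}")
    await to_propose.put(STOP)
    await asyncio.gather(*workers)

    # Drain the final queue up to the shutdown sentinel.
    results = []
    while (result := await finished.get()) is not STOP:
        results.append(result)
    return results


if __name__ == "__main__":
    print(asyncio.run(run_pipeline(3)))
```

In this toy setup the two stages overlap: while `evaluate` handles one item, `propose` is already working on the next, instead of waiting at a barrier for the whole batch.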
Findings
Proposal throughput increased by 3.5x on local vLLM
Proposal throughput increased by 4.9x on API serving
Language-space staleness is inspectable and repairable, so stale artifacts can be updated, discarded, or patched, which improves efficiency
Abstract
LLM-based evolution has emerged as a promising way to improve agents by refining non-parametric artifacts, but its wall-clock cost remains a major bottleneck. We identify that this cost comes from synchronized stage execution and imbalance inside each LLM-heavy stage. We present FlashEvolve, an efficient framework that replaces synchronized execution with asynchronous workers and queues, allowing different stages and steps to overlap. To handle data staleness introduced by asynchrony, FlashEvolve tracks artifact versions and applies different policies to update, discard, or patch stale artifacts. Unlike weight-space staleness in asynchronous RL, language-space staleness is inspectable and repairable: a stale artifact is not just delayed work, but readable evidence that the LLM can reflect on, revise, and turn into useful evolution signal. FlashEvolve further improves throughput and…
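The version tracking and the update, discard, and patch policies described above might look roughly like the sketch below. Everything here is an assumption for illustration: the `Artifact` and `Policy` types, the `resolve_stale` helper, and the exact meaning given to each policy are hypothetical, and `patch_fn` stands in for an LLM call that revises a stale artifact.

```python
from dataclasses import dataclass
from enum import Enum


class Policy(Enum):
    UPDATE = "update"    # assumed meaning: keep the content, re-tag it to the current version
    DISCARD = "discard"  # assumed meaning: drop the stale artifact
    PATCH = "patch"      # assumed meaning: revise the content, then re-tag it


@dataclass
class Artifact:
    text: str
    base_version: int  # version of the shared state this artifact was built against


def resolve_stale(artifact, current_version, policy, patch_fn=None):
    """Return a usable artifact, or None if the stale artifact is discarded."""
    if artifact.base_version == current_version:
        return artifact  # not stale, nothing to do
    if policy is Policy.DISCARD:
        return None
    if policy is Policy.UPDATE:
        return Artifact(artifact.text, current_version)
    if policy is Policy.PATCH:
        if patch_fn is None:
            raise ValueError("PATCH policy needs a patch_fn")
        # patch_fn is a placeholder for an LLM revision step (hypothetical).
        revised = patch_fn(artifact.text, artifact.base_version, current_version)
        return Artifact(revised, current_version)
    raise ValueError(f"unknown policy: {policy}")
```

For example, `resolve_stale(Artifact("prompt draft", 1), 3, Policy.DISCARD)` returns `None`, while the `PATCH` policy passes the stale text and both version numbers to `patch_fn` so that the revision can take the version gap into account.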
