Live-SWE-agent: Can Software Engineering Agents Self-Evolve on the Fly?
Chunqiu Steven Xia, Zhe Wang, Yan Yang, Yuxiang Wei, Lingming Zhang

TL;DR
Live-SWE-agent is a software agent that autonomously and continuously evolves itself at runtime while solving real-world software engineering problems, achieving state-of-the-art performance without offline training.
Contribution
It introduces the first live, self-evolving software agent that adapts its scaffold on-the-fly, surpassing existing agents in solving SWE benchmarks.
Findings
- Achieves a 77.4% solve rate on SWE-bench Verified.
- Outperforms all existing agents without test-time scaling.
- Attains a 45.8% solve rate on SWE-Bench Pro, the best-known result.
Abstract
Large Language Models (LLMs) are reshaping almost all industries, including software engineering. In recent years, a number of LLM agents have been proposed to solve real-world software problems. Such software agents are typically equipped with a suite of coding tools and can autonomously decide the next actions to form complete trajectories to solve end-to-end software tasks. While promising, they typically require dedicated design and may still be suboptimal, since it can be extremely challenging and costly to exhaust the entire agent scaffold design space. Recognizing that software agents are inherently software themselves that can be further refined/modified, researchers have proposed a number of self-improving software agents recently, including the Darwin-Gödel Machine (DGM). Meanwhile, such self-improving agents require costly offline training on specific benchmarks and may not…
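The idea of an agent that chooses actions step by step and can also extend its own tool set while running can be illustrated with a toy sketch. This is not the paper's implementation: the `LiveAgent` class, the action tuple format, and the scripted policy standing in for the LLM are all hypothetical names introduced here for illustration only.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical action format (an assumption, not taken from the paper):
#   ("call", tool_name, argument) or ("create_tool", tool_name, python_source)
Action = Tuple[str, str, str]


class LiveAgent:
    """Toy agent whose tool set can grow while it is running."""

    def __init__(self) -> None:
        # Start with a single built-in tool.
        self.tools: Dict[str, Callable[[str], str]] = {"echo": lambda s: s}
        # The trajectory records every observation in order.
        self.trajectory: List[str] = []

    def create_tool(self, name: str, source: str) -> None:
        # Run the source in a fresh namespace; it must define `run(arg)`.
        namespace: dict = {}
        exec(source, namespace)
        self.tools[name] = namespace["run"]

    def step(self, action: Action) -> str:
        kind, name, payload = action
        if kind == "create_tool":
            self.create_tool(name, payload)
            observation = f"created tool {name}"
        elif kind == "call":
            observation = self.tools[name](payload)
        else:
            raise ValueError(f"unknown action kind: {kind}")
        self.trajectory.append(observation)
        return observation


def scripted_policy() -> List[Action]:
    # Stand-in for the LLM: a fixed plan that defines a tool, then uses it.
    return [
        ("call", "echo", "inspect issue"),
        ("create_tool", "count_lines",
         "def run(arg):\n    return str(len(arg.splitlines()))"),
        ("call", "count_lines", "a\nb\nc"),
    ]


agent = LiveAgent()
for action in scripted_policy():
    agent.step(action)
```

In this sketch the tool set changes between steps without restarting the agent, which is the sense of "on the fly" used above. Calling `exec` on generated text is unsafe outside a sandbox, so a real system would need isolation that this example omits.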
Code & Models
- livesweagent/claude-sonnet-4-5_swebench_verified_trajdataset · 105 dl
- livesweagent/gpt-5_swebench_verified_trajdataset · 30 dl
- livesweagent/gpt-5-mini_swebench_verified_trajdataset · 16 dl
- livesweagent/claude-sonnet-4-5_swebench_pro_trajdataset · 11 dl
- livesweagent/gemini_3_pro_swebench_verified_trajdataset · 44 dl
Taxonomy
Topics: Software Engineering Research · Software Engineering Techniques and Practices · Software System Performance and Reliability
