TL;DR
The paper introduces Agent Evolving Learning (AEL), a framework that lets LLM agents improve across sequences of open-ended tasks by learning how to use memory and reflection; it outperforms existing self-improving methods and baselines on a financial portfolio benchmark.
Contribution
AEL is a novel two-timescale approach that combines learned retrieval policies and reflection-driven insights to enhance agent self-improvement in open-ended environments.
Findings
AEL outperforms existing self-improving methods and baselines on a financial portfolio benchmark.
Memory and reflection mechanisms together yield a 58% improvement over stateless agents.
Adding extra mechanisms degrades performance, highlighting the importance of effective self-diagnosis.
Abstract
LLM agents increasingly operate in open-ended environments spanning hundreds of sequential episodes, yet they remain largely stateless: each task is solved from scratch without converting past experience into better future behavior. The central obstacle is not what to remember but how to use what has been remembered, including which retrieval policy to apply, how to interpret prior outcomes, and when the current strategy itself must change. We introduce Agent Evolving Learning (AEL), a two-timescale framework that addresses this obstacle. At the fast timescale, a Thompson Sampling bandit learns which memory retrieval policy to apply at each episode; at the slow timescale, LLM-driven reflection diagnoses failure patterns and injects causal insights into the agent's decision prompt, giving it an interpretive frame for the evidence it retrieves. On a sequential…
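The fast-timescale component described in the abstract, a Thompson Sampling bandit choosing among memory retrieval policies, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a binary per-episode reward (success or failure) with Beta(1, 1) priors, and the policy names (`recency`, `similarity`, `no_memory`) and simulated success rates are hypothetical placeholders. How AEL actually defines rewards and policies is not stated in the text above.

```python
import random


class ThompsonPolicySelector:
    """Beta-Bernoulli Thompson Sampling over a fixed set of retrieval policies.

    Assumption: each episode yields a binary success signal. The paper's
    actual reward definition is not given in this summary.
    """

    def __init__(self, policies, seed=None):
        self.policies = list(policies)
        self.rng = random.Random(seed)
        # Beta(1, 1) prior per policy: uniform belief over its success rate.
        self.alpha = {p: 1.0 for p in self.policies}
        self.beta = {p: 1.0 for p in self.policies}

    def select(self):
        """Draw one sample from each posterior and pick the largest."""
        samples = {
            p: self.rng.betavariate(self.alpha[p], self.beta[p])
            for p in self.policies
        }
        return max(samples, key=samples.get)

    def update(self, policy, success):
        """Update the chosen policy's posterior with the episode outcome."""
        if success:
            self.alpha[policy] += 1.0
        else:
            self.beta[policy] += 1.0

    def posterior_mean(self, policy):
        """Current estimate of the policy's success probability."""
        return self.alpha[policy] / (self.alpha[policy] + self.beta[policy])


if __name__ == "__main__":
    # Hypothetical policies and success rates, used only to drive the demo.
    true_rates = {"recency": 0.4, "similarity": 0.6, "no_memory": 0.3}
    selector = ThompsonPolicySelector(true_rates, seed=0)
    env_rng = random.Random(1)

    for _ in range(300):  # one bandit step per episode
        policy = selector.select()
        success = env_rng.random() < true_rates[policy]
        selector.update(policy, success)

    for name in true_rates:
        print(name, round(selector.posterior_mean(name), 3))
```

Sampling from each posterior, rather than always picking the highest mean, keeps some exploration of under-tried policies while concentrating episodes on the ones that have worked so far. The slow-timescale reflection step would sit outside this loop and is not modeled here.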
