Memento: Fine-tuning LLM Agents without Fine-tuning LLMs

Huichi Zhou; Yihang Chen; Siyuan Guo; Xue Yan; Kin Hei Lee; Zihan Wang; Ka Yiu Lee; Guchun Zhang; Kun Shao; Linyi Yang; Jun Wang

arXiv:2508.16153 · cs.LG · August 26, 2025

TL;DR

Memento introduces a memory-based online reinforcement learning approach for LLM agents that enables continual adaptation without fine-tuning the underlying LLMs, achieving state-of-the-art results on deep research and out-of-distribution tasks.

Contribution

The paper presents a memory-augmented MDP (M-MDP) framework that lets LLM agents adapt continually without gradient-based fine-tuning, outperforming existing training-based methods.

Findings

01. Achieves top-1 on GAIA validation with 87.88% Pass@3

02. Reaches 79.40% on the GAIA test set, outperforming the prior state of the art

03. Outperforms training-based methods on out-of-distribution tasks

Abstract

In this paper, we introduce a novel learning paradigm for Adaptive Large Language Model (LLM) agents that eliminates the need for fine-tuning the underlying LLMs. Existing approaches are often either rigid, relying on static, handcrafted reflection workflows, or computationally intensive, requiring gradient updates of LLM model parameters. In contrast, our method enables low-cost continual adaptation via memory-based online reinforcement learning. We formalise this as a Memory-augmented Markov Decision Process (M-MDP), equipped with a neural case-selection policy to guide action decisions. Past experiences are stored in an episodic memory, either differentiable or non-parametric. The policy is continually updated based on environmental feedback through a memory rewriting mechanism, whereas policy improvement is achieved through efficient memory reading (retrieval). We instantiate our…
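The read and write operations on the episodic memory can be illustrated with a minimal sketch of the non-parametric variant. This is not the authors' implementation: the names `Case`, `CaseMemory`, `embed`, and `cosine` are hypothetical, the bag-of-words embedding is a toy stand-in for a real text encoder, and plain similarity ranking takes the place of the neural case-selection policy that the abstract mentions.

```python
import math
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Case:
    """One stored experience (hypothetical layout, not from the paper)."""
    state: str     # task or query the agent faced
    action: str    # plan or action the agent took
    reward: float  # feedback received from the environment


def embed(text: str) -> Counter:
    """Toy bag-of-words embedding; an assumption standing in for a real encoder."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class CaseMemory:
    """Non-parametric episodic memory: append on write, rank by similarity on read."""
    cases: list = field(default_factory=list)

    def write(self, state: str, action: str, reward: float) -> None:
        # Memory rewriting: record the outcome of the latest episode.
        self.cases.append(Case(state, action, reward))

    def read(self, state: str, k: int = 2) -> list:
        # Memory reading: return the k stored cases most similar to the query.
        query = embed(state)
        ranked = sorted(
            self.cases,
            key=lambda case: cosine(query, embed(case.state)),
            reverse=True,
        )
        return ranked[:k]


memory = CaseMemory()
memory.write("find population of Paris", "search the web", 1.0)
memory.write("compute integral of x squared", "run code", 1.0)
memory.write("find population of Berlin", "answer from memory", 0.0)

# Retrieved cases (successes and failures alike) would be placed in the
# LLM prompt as examples; the LLM weights themselves are never updated.
for case in memory.read("find population of Rome", k=2):
    print(case.state, "->", case.action, f"(reward={case.reward})")
```

In this sketch all adaptation happens through `write` and `read`, which is the sense in which the agent improves without gradient updates to the LLM. A differentiable memory, the other option named in the abstract, would replace the fixed similarity ranking with a learned scoring function.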
