Large Language Model Agents Are Not Always Faithful Self-Evolvers

Weixiang Zhao; Yingshuo Wang; Yichen Zhang; Yang Deng; Yanyan Zhao; Wanxiang Che; Bing Qin; Ting Liu

arXiv:2601.22436·cs.CL·February 10, 2026

TL;DR

This paper systematically investigates whether self-evolving LLM agents reliably depend on their experience, revealing a significant gap in their use of condensed experience and highlighting underlying causes.

Contribution

It provides the first comprehensive causal analysis of experience faithfulness in self-evolving LLM agents across multiple frameworks and environments.

Findings

01 Agents depend on raw experience but often ignore condensed experience.

02 The gap persists across different scales and configurations.

03 Underlying causes include semantic limitations, processing biases, and task regimes.

Abstract

Self-evolving large language model (LLM) agents continually improve by accumulating and reusing past experience, yet it remains unclear whether they faithfully rely on that experience to guide their behavior. We present the first systematic investigation of experience faithfulness, the causal dependence of an agent's decisions on the experience it is given, in self-evolving LLM agents. Using controlled causal interventions on both raw and condensed forms of experience, we comprehensively evaluate four representative frameworks across 10 LLM backbones and 9 environments. Our analysis uncovers a striking asymmetry: while agents consistently depend on raw experience, they often disregard or misinterpret condensed experience, even when it is the only experience provided. This gap persists across single- and multi-agent configurations and across backbone scales. We trace its underlying…
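
The notion of experience faithfulness as causal dependence can be illustrated with a minimal sketch. This is not the paper's protocol or metric: the function `behavior_change_rate`, the two toy agents, and the choice of "replace the experience with an empty string" as the intervention are all hypothetical, chosen only to show the shape of an intervention-based check. The idea is to hold the task fixed, swap the experience, and count how often the agent's decision changes.

```python
from typing import Callable, Sequence

# Hypothetical agent interface: (task, experience) -> action.
Agent = Callable[[str, str], str]


def behavior_change_rate(
    agent: Agent,
    tasks: Sequence[str],
    experience: str,
    intervened_experience: str,
) -> float:
    """Fraction of tasks whose action changes when the experience is swapped.

    A rate near 0 suggests the agent ignores the experience; a rate near 1
    suggests its decisions depend on it. This is an illustrative proxy only.
    """
    if not tasks:
        raise ValueError("tasks must be non-empty")
    changed = sum(
        agent(task, experience) != agent(task, intervened_experience)
        for task in tasks
    )
    return changed / len(tasks)


# Toy stand-ins for real LLM agents (assumptions for illustration).
def faithful_agent(task: str, experience: str) -> str:
    # Acts on whatever the experience says.
    return f"{task}:{experience}"


def unfaithful_agent(task: str, experience: str) -> str:
    # Ignores the experience entirely.
    return f"{task}:default"


tasks = ["open_door", "find_key", "cook_egg"]
original = "use the red key"
intervened = ""  # intervention: blank out the experience

faithful_rate = behavior_change_rate(faithful_agent, tasks, original, intervened)
unfaithful_rate = behavior_change_rate(unfaithful_agent, tasks, original, intervened)
```

In this toy setup the agent that reads its experience changes every action under the intervention, while the agent that ignores it changes none. A real study would replace the toy agents with model calls and would need to compare actions more carefully than by exact string equality.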

Taxonomy

Topics: Language and cultural evolution · Topic Modeling · Multimodal Machine Learning Applications