On Information Self-Locking in Reinforcement Learning for Active Reasoning of LLM agents

Deyu Zou; Yongqiang Chen; Fan Feng; Mufei Li; Pan Li; Yu Gong; James Cheng

arXiv:2603.12109 · cs.AI · March 13, 2026

Open Access

TL;DR

This paper investigates the problem of information self-locking in reinforcement learning-trained LLM agents during active reasoning, identifies core causes, and proposes a method to improve information exploration, resulting in significant performance gains.

Contribution

The paper introduces a decomposition of active reasoning into Action Selection and Belief Tracking, and proposes a critique-based approach to mitigate information self-locking in RL-trained LLM agents.

Findings

01. Up to 60% improvement in mitigating information self-locking

02. Identified feedback loop limiting information exploration

03. Effective approach across 7 datasets

Abstract

Reinforcement learning (RL) with outcome-based rewards has achieved significant success in training large language model (LLM) agents for complex reasoning tasks. However, in active reasoning where agents need to strategically ask questions to acquire task-relevant information, we find that LLM agents trained with RL often suffer from information self-locking: the agent ceases to ask informative questions and struggles to internalize already-obtained information. To understand the phenomenon, we decompose active reasoning into two core capabilities: Action Selection (AS), which determines the observation stream through queries, and Belief Tracking (BT), which updates the agent's belief based on collected evidence. We show that deficient AS and BT capabilities will limit the information exploration during RL training. Furthermore, insufficient exploration in turn hinders the improvement…
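To make the two capabilities named in the abstract concrete, here is a minimal toy sketch, not the paper's method. It assumes a 20-questions-style setting in which each query is a yes/no question of the form "is the hidden target in this set?", the answers are always truthful, and the belief is an explicit probability table. The function names (`update_belief`, `select_action`, `run_episode`) and the information-gain scoring rule are illustrative assumptions, not taken from the paper.

```python
import math

def update_belief(belief, query, answer):
    """Belief Tracking (toy): drop hypotheses that contradict the answer, renormalize."""
    posterior = {h: p for h, p in belief.items() if (h in query) == answer}
    total = sum(posterior.values())
    return {h: p / total for h, p in posterior.items()}

def entropy(belief):
    """Shannon entropy of the belief, in bits."""
    return -sum(p * math.log2(p) for p in belief.values() if p > 0)

def expected_info_gain(belief, query):
    """Expected entropy reduction from asking `query` under the current belief."""
    p_yes = sum(p for h, p in belief.items() if h in query)
    gain = entropy(belief)
    for answer, p_answer in ((True, p_yes), (False, 1 - p_yes)):
        if p_answer > 0:
            gain -= p_answer * entropy(update_belief(belief, query, answer))
    return gain

def select_action(belief, queries):
    """Action Selection (toy): ask the query with the highest expected gain."""
    return max(queries, key=lambda q: expected_info_gain(belief, q))

def run_episode(target, hypotheses, queries, max_turns=10):
    """Alternate Action Selection and Belief Tracking until one hypothesis remains."""
    hypotheses = list(hypotheses)
    belief = {h: 1 / len(hypotheses) for h in hypotheses}  # uniform prior (assumption)
    turns = 0
    while len(belief) > 1 and turns < max_turns:
        query = select_action(belief, queries)
        answer = target in query  # truthful environment (assumption)
        belief = update_belief(belief, query, answer)
        turns += 1
    return belief, turns

# Hypothetical setup: 8 candidate targets, one query per bit of the target index.
queries = [frozenset(h for h in range(8) if (h >> bit) & 1) for bit in range(3)]
final_belief, turns = run_episode(5, range(8), queries)
print(final_belief, turns)
```

In this toy, both components are ideal, so the belief collapses onto the target as quickly as the query set allows. Replacing `select_action` with a rule that keeps repeating an already-answered query, or `update_belief` with one that ignores the answer, gives a rough picture of the two failure modes the abstract describes: no new information is requested, or obtained information is not absorbed.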

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Explainable Artificial Intelligence (XAI)