Optimizing Agentic Reasoning with Retrieval via Synthetic Semantic Information Gain Reward
Senkang Hu, Yong Dai, Yuzhi Zhao, Yihang Tao, Yu Guo, Zhengru Fang, Sam Tak Wu Kwong, Yuguang Fang

TL;DR
This paper introduces InfoReasoner, a framework that improves agentic reasoning in large reasoning models by using a synthetic semantic information gain reward, leading to better retrieval strategies and higher accuracy in question-answering tasks.
Contribution
It presents a theoretically grounded, scalable method for optimizing retrieval in large reasoning models using a novel intrinsic reward based on semantic information gain.
Findings
Achieves up to 5.4% accuracy improvement on seven QA benchmarks.
Provides a formal definition and guarantees for information gain as uncertainty reduction.
Demonstrates effective training without manual retrieval annotations.
Abstract
Agentic reasoning enables large reasoning models (LRMs) to dynamically acquire external knowledge, yet optimizing the retrieval process remains challenging due to the lack of dense, principled reward signals. In this paper, we introduce InfoReasoner, a unified framework that incentivizes effective information seeking via a synthetic semantic information gain reward. Theoretically, we redefine information gain as uncertainty reduction over the model's belief states, establishing guarantees, including non-negativity, telescoping additivity, and channel monotonicity. Practically, to enable scalable optimization without manual retrieval annotations, we propose an output-aware intrinsic estimator that computes information gain directly from the model's output distributions using semantic clustering via bidirectional textual entailment. This intrinsic reward guides the policy to maximize…
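The abstract describes estimating information gain from the model's own outputs by clustering answers with bidirectional textual entailment. The sketch below illustrates that general idea only and is not the authors' implementation. All names here (`cluster_by_entailment`, `semantic_entropy`, `information_gain`, `toy_entails`) are hypothetical. It assumes answers are sampled before and after a retrieval step, that cluster probabilities are estimated by sample frequency, and that an entailment judge is supplied as a callable. The toy judge uses case-insensitive string equality in place of a real entailment model.

```python
import math
from typing import Callable, List

# Hypothetical signature: entails(premise, hypothesis) -> bool.
Entails = Callable[[str, str], bool]


def cluster_by_entailment(answers: List[str], entails: Entails) -> List[List[str]]:
    """Greedily group answers that entail each other in both directions."""
    clusters: List[List[str]] = []
    for answer in answers:
        for cluster in clusters:
            representative = cluster[0]
            # Bidirectional check: both directions must hold to share a cluster.
            if entails(answer, representative) and entails(representative, answer):
                cluster.append(answer)
                break
        else:
            clusters.append([answer])
    return clusters


def semantic_entropy(answers: List[str], entails: Entails) -> float:
    """Entropy in bits over semantic clusters, using sample frequencies."""
    if not answers:
        raise ValueError("need at least one sampled answer")
    total = len(answers)
    clusters = cluster_by_entailment(answers, entails)
    return sum(-(len(c) / total) * math.log2(len(c) / total) for c in clusters)


def information_gain(before: List[str], after: List[str], entails: Entails) -> float:
    """Uncertainty reduction: entropy before retrieval minus entropy after."""
    return semantic_entropy(before, entails) - semantic_entropy(after, entails)


def toy_entails(premise: str, hypothesis: str) -> bool:
    """Toy stand-in for an entailment model (assumption, not from the paper)."""
    return premise.strip().lower() == hypothesis.strip().lower()


if __name__ == "__main__":
    before = ["Paris", "paris", "Lyon", "Marseille"]  # answers sampled without retrieval
    after = ["Paris", "paris", "PARIS", " Paris "]    # answers sampled with retrieval
    print(information_gain(before, after, toy_entails))  # prints 1.5
```

In this toy run the pre-retrieval samples fall into three clusters (1.5 bits) and the post-retrieval samples into one (0 bits), so the estimated gain is 1.5 bits. A sample-based difference like this can come out negative on other inputs; the non-negativity guarantee stated in the abstract belongs to the paper's formal definition, which this sketch does not reproduce.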
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · Information Retrieval and Search Behavior
