Optimizing Agentic Reasoning with Retrieval via Synthetic Semantic Information Gain Reward

Senkang Hu; Yong Dai; Yuzhi Zhao; Yihang Tao; Yu Guo; Zhengru Fang; Sam Tak Wu Kwong; Yuguang Fang

arXiv:2602.00845·cs.AI·February 10, 2026

Open Access

TL;DR

This paper introduces InfoReasoner, a framework that improves agentic reasoning in large reasoning models (LRMs) by rewarding retrieval with a synthetic semantic information gain signal, leading to better retrieval strategies and higher accuracy on question-answering tasks.

Contribution

It presents a theoretically grounded, scalable method for optimizing retrieval in large reasoning models using a novel intrinsic reward based on semantic information gain.

Findings

1. Achieves up to 5.4% accuracy improvement on seven QA benchmarks.

2. Provides a formal definition and guarantees for information gain as uncertainty reduction.

3. Demonstrates effective training without manual retrieval annotations.

Abstract

Agentic reasoning enables large reasoning models (LRMs) to dynamically acquire external knowledge, yet optimizing the retrieval process remains challenging due to the lack of dense, principled reward signals. In this paper, we introduce InfoReasoner, a unified framework that incentivizes effective information seeking via a synthetic semantic information gain reward. Theoretically, we redefine information gain as uncertainty reduction over the model's belief states, establishing guarantees including non-negativity, telescoping additivity, and channel monotonicity. Practically, to enable scalable optimization without manual retrieval annotations, we propose an output-aware intrinsic estimator that computes information gain directly from the model's output distributions using semantic clustering via bidirectional textual entailment. This intrinsic reward guides the policy to maximize…
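The idea of information gain as uncertainty reduction over semantically clustered outputs can be illustrated with a minimal sketch. This is not the paper's estimator, whose exact form is not given in the abstract. It assumes that answers sampled before and after a retrieval step are grouped into meaning classes whenever each entails the other, and that the gain is the drop in entropy over those classes. The function names (`cluster_by_equivalence`, `semantic_entropy`, `information_gain`) and the string-matching `toy_entails` are hypothetical stand-ins; a real system would presumably use an entailment model in place of `toy_entails`.

```python
import math
from typing import Callable, List

Entails = Callable[[str, str], bool]


def cluster_by_equivalence(answers: List[str], entails: Entails) -> List[List[str]]:
    """Greedily group answers that entail each other in both directions."""
    clusters: List[List[str]] = []
    for answer in answers:
        for cluster in clusters:
            representative = cluster[0]
            # Bidirectional check: same meaning only if entailment holds both ways.
            if entails(answer, representative) and entails(representative, answer):
                cluster.append(answer)
                break
        else:
            clusters.append([answer])
    return clusters


def semantic_entropy(answers: List[str], entails: Entails) -> float:
    """Shannon entropy (in nats) of the empirical distribution over meaning clusters."""
    clusters = cluster_by_equivalence(answers, entails)
    total = len(answers)
    return sum(-(len(c) / total) * math.log(len(c) / total) for c in clusters)


def information_gain(before: List[str], after: List[str], entails: Entails) -> float:
    """Entropy drop between samples drawn before and after a retrieval step."""
    return semantic_entropy(before, entails) - semantic_entropy(after, entails)


def toy_entails(a: str, b: str) -> bool:
    """Hypothetical stand-in for an entailment model: case-insensitive match."""
    return a.strip().lower() == b.strip().lower()


# Illustrative samples only, not data from the paper.
samples_before = ["Paris", "Lyon", "paris", "Marseille"]
samples_after = ["Paris", "paris", "PARIS", "Paris"]
gain = information_gain(samples_before, samples_after, toy_entails)
print(f"estimated gain: {gain:.4f} nats")
```

In this toy case the answers collapse from three meaning clusters to one after retrieval, so the estimated gain is positive. A plain sample-based entropy difference like this one can come out negative when retrieval makes the sampled answers more scattered; the non-negativity guarantee stated in the abstract refers to the paper's own formal definition, not to this sketch.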

Taxonomy

Topics: Multimodal Machine Learning Applications · Topic Modeling · Information Retrieval and Search Behavior