IAPO: Information-Aware Policy Optimization for Token-Efficient Reasoning

Yinhan He; Yaochen Zhu; Mingjia Shi; Wendy Zheng; Lin Su; Xiaoqing Wang; Qi Guo; Jundong Li

arXiv:2602.19049·cs.CL·February 24, 2026

Open Access

TL;DR

IAPO introduces an information-theoretic framework for token-efficient reasoning in large language models, improving accuracy while reducing reasoning length by focusing on informative reasoning steps through mutual information analysis.

Contribution

It proposes a novel token-wise advantage shaping method based on mutual information, enabling explicit control over reasoning effort distribution during post-training.
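For reference, the conditional mutual information behind such a token-wise signal can be written with the standard entropy identity below. The symbols are assumptions for illustration, not the paper's notation: q is the prompt, y_t the t-th reasoning token, y_{<t} the preceding tokens, and a the final answer.

```latex
% Standard identity I(X; Y | Z) = H(Y | Z) - H(Y | X, Z),
% with X = y_t, Y = a, Z = (q, y_{<t}). Notation is assumed, not taken from the paper.
I\left(y_t ;\, a \mid q,\, y_{<t}\right)
  = H\left(a \mid q,\, y_{<t}\right) - H\left(a \mid q,\, y_{\le t}\right)
```

Read this way, a token is informative to the extent that conditioning on it reduces uncertainty about the final answer.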

Findings

01 Reduces reasoning length by up to 36%

02 Improves reasoning accuracy across datasets

03 Outperforms existing token-efficient RL methods

Abstract

Large language models increasingly rely on long chains of thought to improve accuracy, yet such gains come with substantial inference-time costs. We revisit token-efficient post-training and argue that existing sequence-level reward-shaping methods offer limited control over how reasoning effort is allocated across tokens. To bridge this gap, we propose IAPO, an information-theoretic post-training framework that assigns token-wise advantages based on each token's conditional mutual information (MI) with the final answer. This yields an explicit, principled mechanism for identifying informative reasoning steps and suppressing low-utility exploration. We provide a theoretical analysis showing that IAPO can induce monotonic reductions in reasoning verbosity without harming correctness. Empirically, IAPO consistently improves reasoning accuracy while reducing reasoning length by up to…
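The idea of token-wise advantages driven by information about the final answer can be illustrated with a minimal sketch. This is not the paper's estimator or objective: the function names, the `beta` weight, the centering step, and the assumption that a per-prefix distribution over candidate answers is available are all hypothetical. The sketch uses the per-token drop in answer entropy as a simple proxy for how informative each token is.

```python
import math
from typing import List, Sequence


def entropy(p: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(x * math.log(x) for x in p if x > 0.0)


def info_gains(answer_dists: List[Sequence[float]]) -> List[float]:
    """Per-token reduction in answer entropy.

    answer_dists[t] is a hypothetical estimate of
    p(answer | prompt, first t reasoning tokens), so a rollout with
    T tokens supplies T + 1 distributions and yields T gains.
    """
    h = [entropy(p) for p in answer_dists]
    return [h[t] - h[t + 1] for t in range(len(h) - 1)]


def shaped_advantages(
    seq_advantage: float, gains: List[float], beta: float = 1.0
) -> List[float]:
    """Spread a sequence-level advantage across tokens.

    Assumed shaping rule (illustrative only): each token receives the
    sequence-level advantage plus beta times its centered information
    gain, so the mean advantage over tokens is unchanged.
    """
    if not gains:
        return []
    mean_gain = sum(gains) / len(gains)
    return [seq_advantage + beta * (g - mean_gain) for g in gains]


# Toy rollout with two tokens and two candidate answers: the first token
# leaves the answer distribution unchanged, the second resolves it.
dists = [[0.5, 0.5], [0.5, 0.5], [1.0, 0.0]]
gains = info_gains(dists)
advantages = shaped_advantages(seq_advantage=1.0, gains=gains, beta=1.0)
```

In this toy rollout the second token receives the larger advantage because it is the one that removes uncertainty about the answer. Centering the gains is a design choice made here so that the shaping redistributes credit across tokens without changing the average advantage of the sequence.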

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Topic Modeling · Multimodal Machine Learning Applications