Information-Theoretic Intrinsic Motivation for Reinforcement Learning in Combinatorial Routing
Ruozhang Xi, Yao Ni, Wangyu Wu

TL;DR
This paper introduces an intrinsic motivation method for reinforcement learning, grounded in the Information Bottleneck principle, that improves exploration in combinatorial routing problems.
Contribution
A novel information-theoretic framework for intrinsic motivation using the Information Bottleneck principle in combinatorial state spaces.
Findings
The method improves exploration efficiency in high-dimensional routing problems.
It achieves better training stability and solution quality compared to standard RL baselines.
Neural mutual information estimators enable scalable implementation without explicit density modeling.
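To illustrate how mutual information can be estimated from samples alone, without modeling any density, here is a minimal sketch of the InfoNCE lower bound. This summary does not say which estimator the paper uses, so InfoNCE is an assumption chosen for illustration. The names `infonce_lower_bound` and `dot_critic` are hypothetical, and the dot-product critic is a stand-in for a trained neural critic.

```python
import math
from typing import List, Sequence


def infonce_lower_bound(scores: Sequence[Sequence[float]]) -> float:
    """InfoNCE lower bound on mutual information, in nats.

    scores[i][j] is a critic value f(x_i, y_j) for a batch of K pairs.
    Diagonal entries score the true (jointly sampled) pairs; off-diagonal
    entries act as negatives drawn from the product of marginals.
    The bound can never exceed log(K).
    """
    k = len(scores)
    total = 0.0
    for i, row in enumerate(scores):
        # Numerically stable log-sum-exp over the row.
        row_max = max(row)
        log_norm = row_max + math.log(sum(math.exp(s - row_max) for s in row))
        # Log-probability that the true pair is picked out of K candidates.
        total += row[i] - log_norm
    return total / k + math.log(k)


def dot_critic(z: Sequence[Sequence[float]],
               z_next: Sequence[Sequence[float]]) -> List[List[float]]:
    """Hypothetical critic: plain dot product between latent vectors.

    In practice this would be a learned network; it is fixed here so the
    example runs without a training loop.
    """
    return [[sum(a * b for a, b in zip(zi, zj)) for zj in z_next] for zi in z]


if __name__ == "__main__":
    # Toy latent states and their (assumed) successors.
    z = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
    z_next = [[0.9, 0.1], [0.1, 0.9], [-0.9, 0.0]]
    bound = infonce_lower_bound(dot_critic(z, z_next))
    print(f"InfoNCE estimate: {bound:.3f} nats (cap: {math.log(len(z)):.3f})")
```

The estimate uses only critic scores on sampled pairs, which is what makes this family of estimators practical in high-dimensional spaces. Because the bound saturates at log K, larger batches are needed to measure larger amounts of mutual information.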
Abstract
Intrinsic motivation provides a principled mechanism for driving exploration in reinforcement learning when external rewards are sparse or delayed. A central challenge, however, lies in defining meaningful novelty signals in high-dimensional and combinatorial state spaces, where observation-level density estimation and prediction-error heuristics often become unreliable. In this work, we propose an information-theoretic framework for intrinsically motivated reinforcement learning grounded in the Information Bottleneck principle. Our approach learns compact latent state representations by explicitly balancing the compression of observations and the preservation of predictive information about future state transitions. Within this bottlenecked latent space, intrinsic rewards are defined through information-theoretic quantities that characterize the novelty of state–action transitions in…
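The trade-off between compression and prediction described in the abstract can be written in the standard Information Bottleneck form. This is a sketch of the generic objective, not necessarily the paper's exact formulation. The symbols $S_t$ (observation), $Z_t$ (latent representation), $S_{t+1}$ (next state), encoder parameters $\phi$, and trade-off weight $\beta$ are notation assumed here.

```latex
\min_{\phi}\; I(S_t; Z_t) \;-\; \beta\, I(Z_t; S_{t+1}),
\qquad Z_t \sim p_{\phi}(z \mid s_t), \quad \beta > 0
```

The first term penalizes information the latent keeps about the raw observation (compression), while the second rewards information it keeps about the next state (prediction). A larger $\beta$ favors predictive content over compactness.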
Taxonomy
Topics: Reinforcement Learning in Robotics · Vehicle Routing Optimization Methods · Traffic control and management
