Minimum information divergence of Q-functions for dynamic treatment resumes
Shinto Eguchi

TL;DR
This paper introduces an information-geometric approach to reinforcement learning for dynamic treatment regimes. It proposes a minimum divergence estimator of the optimal policy based on Q-functions and evaluates it in numerical experiments.
Contribution
It develops a new framework using information divergence and geometric means to estimate optimal policies in reinforcement learning for dynamic treatments.
Findings
The $\gamma$-power divergence vanishes for policy-equivalent Q-functions.
The proposed method effectively estimates policies in dynamic treatment regimes.
Numerical experiments show improved performance of the minimum divergence approach.
Abstract
This paper presents a new application of information geometry to reinforcement learning, focusing on dynamic treatment regimes. In a standard framework of reinforcement learning, a Q-function is defined as the conditional expectation of a reward given a state and an action for a single-stage situation. We introduce an equivalence relation, called the policy equivalence, in the space of all the Q-functions. A class of information divergences is defined in the Q-function space for every stage. The main objective is to propose an estimator of the optimal policy function by a method of minimum information divergence based on a dataset of trajectories. In particular, we discuss the $\gamma$-power divergence, which is shown to have the advantageous property that the $\gamma$-power divergence between policy-equivalent Q-functions vanishes. This property essentially works to seek the…
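To make the single-stage setting in the abstract concrete, the sketch below estimates a tabular Q-function as the sample mean of the reward for each (state, action) pair and reads off the greedy policy. It is an illustration under stated assumptions, not the paper's estimator: the formal definition of policy equivalence and the formula for the $\gamma$-power divergence are not given on this page, so the code assumes that two Q-functions are policy-equivalent when they induce the same greedy policy. The function names and the toy data are hypothetical.

```python
from collections import defaultdict


def estimate_q(samples):
    """Tabular single-stage Q estimate: mean reward per (state, action)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for state, action, reward in samples:
        totals[(state, action)] += reward
        counts[(state, action)] += 1
    return {key: totals[key] / counts[key] for key in totals}


def greedy_policy(q):
    """Map each state to the action with the largest estimated Q-value."""
    best = {}
    for (state, action), value in q.items():
        if state not in best or value > best[state][1]:
            best[state] = (action, value)
    return {state: action for state, (action, _) in best.items()}


def same_greedy_policy(q1, q2):
    """Assumed stand-in for policy equivalence: identical greedy policies."""
    return greedy_policy(q1) == greedy_policy(q2)


# Hypothetical toy data: (state, action, reward) triples.
samples = [
    ("mild", "drug_a", 1.0), ("mild", "drug_a", 3.0),
    ("mild", "drug_b", 1.0),
    ("severe", "drug_a", 0.0),
    ("severe", "drug_b", 2.0), ("severe", "drug_b", 4.0),
]

q_hat = estimate_q(samples)

# A positive affine transform changes every Q-value but not the argmax,
# so the two tables select the same action in every state.
q_shifted = {key: 2.0 * value + 5.0 for key, value in q_hat.items()}

policy = greedy_policy(q_hat)
```

Under this assumed reading, `q_hat` and `q_shifted` differ numerically yet select the same treatment in every state. That is the kind of pair for which the abstract states the $\gamma$-power divergence vanishes.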
Taxonomy
Topics: Gene Regulatory Network Analysis · Advanced Causal Inference Techniques · Receptor Mechanisms and Signaling
