Minimum information divergence of Q-functions for dynamic treatment resumes

Shinto Eguchi

arXiv:2211.08741·stat.ME·November 17, 2022

Open Access

TL;DR

This paper introduces an information-geometric approach to reinforcement learning for dynamic treatment regimes. It proposes a minimum divergence estimator of the optimal policy based on Q-functions and evaluates the estimator in numerical experiments.

Contribution

It develops a new framework using information divergence and geometric means to estimate optimal policies in reinforcement learning for dynamic treatments.

Findings

01

The $γ$-power divergence vanishes for policy-equivalent Q-functions.

02

The proposed method effectively estimates policies in dynamic treatment regimes.

03

Numerical experiments show improved performance of the minimum divergence approach.

Abstract

This paper aims at presenting a new application of information geometry to reinforcement learning focusing on dynamic treatment resumes. In a standard framework of reinforcement learning, a Q-function is defined as the conditional expectation of a reward given a state and an action for a single-stage situation. We introduce an equivalence relation, called the policy equivalence, in the space of all the Q-functions. A class of information divergence is defined in the Q-function space for every stage. The main objective is to propose an estimator of the optimal policy function by a method of minimum information divergence based on a dataset of trajectories. In particular, we discuss the $γ$-power divergence that is shown to have an advantageous property such that the $γ$-power divergence between policy-equivalent Q-functions vanishes. This property essentially works to seek the…
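
Two notions from the abstract, the single-stage Q-function as a conditional expectation of reward and policy equivalence between Q-functions, can be illustrated with a minimal tabular sketch. Everything in it is an assumption made for illustration: the toy data, the helper names `estimate_q` and `greedy_policy`, and the reading of policy equivalence as "inducing the same greedy policy". It does not implement the paper's minimum divergence estimator or the $γ$-power divergence, whose definitions are not given on this page.

```python
import numpy as np


def estimate_q(states, actions, rewards, n_states, n_actions):
    """Tabular estimate of Q(s, a) = E[reward | state = s, action = a].

    Hypothetical helper: averages the observed rewards for each
    (state, action) pair. Pairs that never occur are left at 0.
    """
    sums = np.zeros((n_states, n_actions))
    counts = np.zeros((n_states, n_actions))
    # Unbuffered accumulation so repeated (s, a) indices are all counted.
    np.add.at(sums, (states, actions), rewards)
    np.add.at(counts, (states, actions), 1.0)
    return sums / np.maximum(counts, 1.0)


def greedy_policy(q):
    """Policy that picks, for each state, the action maximizing Q."""
    return np.argmax(q, axis=1)


# Toy single-stage dataset (made up for illustration):
# two states, two actions, two observations per (state, action) pair.
states = np.array([0, 0, 0, 0, 1, 1, 1, 1])
actions = np.array([0, 0, 1, 1, 0, 0, 1, 1])
rewards = np.array([1.0, 3.0, 0.0, 2.0, 0.0, 1.0, 4.0, 6.0])

q_hat = estimate_q(states, actions, rewards, n_states=2, n_actions=2)

# Adding a state-dependent constant changes the Q-values but not the
# ranking of actions within a state, so the greedy policy is unchanged.
# Under the assumed reading, q_hat and q_shifted are policy-equivalent.
shift = np.array([[10.0], [-3.0]])
q_shifted = q_hat + shift

policy = greedy_policy(q_hat)
policy_shifted = greedy_policy(q_shifted)
```

Here `q_hat` and `q_shifted` differ in every entry yet select the same action in each state, which is the kind of indifference the abstract attributes to the $γ$-power divergence between policy-equivalent Q-functions.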

Taxonomy

Topics: Gene Regulatory Network Analysis · Advanced Causal Inference Techniques · Receptor Mechanisms and Signaling