Policy Learning for Individualized Treatment Regimes on Infinite Time Horizon
Wenzhuo Zhou, Yuhan Li, Ruoqing Zhu

TL;DR
This paper gives a selective overview of statistical methods for reinforcement learning in the infinite-horizon setting, focusing on personalized treatment policies informed by real-time data, and discusses their modeling frameworks, generalizability, and interpretability.
Contribution
It provides an overview of recent methodologies for policy learning in infinite-horizon reinforcement learning, highlighting challenges and future research directions.
Findings
Discusses modeling frameworks for infinite horizon RL
Analyzes generalizability and interpretability issues
Provides use case examples and future research directions
Abstract
With the recent advancements of technology in facilitating real-time monitoring and data collection, "just-in-time" interventions can be delivered via mobile devices to achieve both real-time and long-term management and control. Reinforcement learning formalizes such mobile interventions as a sequence of decision rules and assigns treatment arms based on the user's status at each decision point. In practice, real applications concern a large number of decision points beyond the time horizon of the currently collected data. This usually refers to reinforcement learning in the infinite horizon setting, which becomes much more challenging. This article provides a selective overview of some statistical methodologies on this topic. We discuss their modeling framework, generalizability, and interpretability and provide some use case examples. Some future research directions are discussed in…
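To make the abstract's framing concrete, the following is a minimal sketch of how a decision rule can be scored over an infinite horizon. It assumes a discounted-reward criterion and a two-state, two-action toy model; the states, transition probabilities, rewards, discount factor, and the names `evaluate_policy`, `never_prompt`, and `prompt_when_stressed` are all hypothetical illustrations, not the methods or data reviewed in the paper.

```python
# Toy infinite-horizon policy evaluation (illustrative only).
# All numbers are hypothetical; they are not taken from the paper.

GAMMA = 0.9  # assumed discount factor

# States: 0 = "low stress", 1 = "high stress" (hypothetical user status).
# Actions: 0 = "no prompt", 1 = "send prompt" (hypothetical treatment arms).
# P[s][a][s2] is the probability of moving from state s to s2 under action a.
P = [
    [[0.8, 0.2], [0.9, 0.1]],
    [[0.3, 0.7], [0.6, 0.4]],
]
# R[s][a] is the immediate reward for taking action a in state s.
R = [
    [1.0, 0.8],
    [0.0, -0.2],
]


def evaluate_policy(policy, gamma=GAMMA, tol=1e-10, max_iter=10_000):
    """Return the discounted value of a stationary deterministic policy.

    `policy[s]` is the action assigned in state s. The value is found by
    repeatedly applying the Bellman expectation update until it stabilizes.
    """
    n_states = len(P)
    v = [0.0] * n_states
    for _ in range(max_iter):
        new_v = []
        for s in range(n_states):
            a = policy[s]
            expected_next = sum(P[s][a][s2] * v[s2] for s2 in range(n_states))
            new_v.append(R[s][a] + gamma * expected_next)
        if max(abs(x - y) for x, y in zip(new_v, v)) < tol:
            return new_v
        v = new_v
    return v


# Two candidate decision rules, each mapping user status to a treatment arm.
never_prompt = [0, 0]
prompt_when_stressed = [0, 1]

v_never = evaluate_policy(never_prompt)
v_prompt = evaluate_policy(prompt_when_stressed)
print("never prompt:        ", v_never)
print("prompt when stressed:", v_prompt)
```

In this toy model the rule that prompts only in the high-stress state attains a higher long-run value from either starting state, even though the prompt carries an immediate cost. In practice the transition model is unknown and the value must be estimated from a finite batch of observed trajectories, which is where the difficulty described in the abstract arises.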
Taxonomy
Topics: Mental Health Research Topics
