Reinforcement Learning in Dynamic Treatment Regimes Needs Critical Reexamination
Zhiyao Luo, Yangchen Pan, Peter Watkinson, Tingting Zhu

TL;DR
This paper critically examines the use of offline reinforcement learning in dynamic treatment regimes, highlighting evaluation inconsistencies, the importance of baselines, and the variability in algorithm performance, and calls for more rigorous assessment methods.
Contribution
It provides a comprehensive case study demonstrating the variability in RL performance for DTRs and calls for reassessment of current evaluation practices and formulations.
Findings
RL performance varies significantly with evaluation metrics and MDP formulations.
In some cases, RL algorithms are outperformed by random baselines.
Current evaluation practices may lead to inconclusive or misleading results.
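
To make the baseline finding concrete, here is a minimal sketch of scoring a random policy and a candidate policy with the same off-policy estimator. It uses weighted importance sampling (WIS), a standard off-policy estimator; whether it corresponds to the metrics used in the paper is an assumption. The function name `wis_value`, the tabular policy arrays, and the four toy trajectories are hypothetical illustrations, not taken from the paper or the Sepsis dataset.

```python
import numpy as np


def wis_value(trajectories, target, behavior, gamma=1.0):
    """Weighted importance sampling estimate of the target policy's return.

    trajectories: list of episodes, each a list of (state, action, reward).
    target, behavior: arrays of shape (n_states, n_actions) holding
    action probabilities for each state (hypothetical tabular setup).
    """
    weights, returns = [], []
    for episode in trajectories:
        rho, g = 1.0, 0.0
        for t, (s, a, r) in enumerate(episode):
            # Cumulative importance ratio between target and behavior policy.
            rho *= target[s, a] / behavior[s, a]
            # Discounted return actually observed in the logged episode.
            g += (gamma ** t) * r
        weights.append(rho)
        returns.append(g)
    weights = np.asarray(weights)
    returns = np.asarray(returns)
    total = weights.sum()
    # If no logged episode is consistent with the target policy, WIS is undefined.
    return float((weights * returns).sum() / total) if total > 0 else float("nan")


# Toy logged data (synthetic): 2 states, 2 actions, uniform behavior policy.
toy_trajectories = [
    [(0, 1, 0.0), (1, 1, 1.0)],
    [(0, 0, 0.0), (1, 0, -1.0)],
    [(0, 1, 0.0), (1, 0, -1.0)],
    [(0, 1, 0.0), (1, 1, 1.0)],
]
behavior_policy = np.full((2, 2), 0.5)

# Random baseline: uniform over actions in every state.
random_policy = np.full((2, 2), 0.5)
# Candidate policy: always choose action 1.
always_one_policy = np.array([[0.0, 1.0], [0.0, 1.0]])

random_value = wis_value(toy_trajectories, random_policy, behavior_policy)
always_one_value = wis_value(toy_trajectories, always_one_policy, behavior_policy)
print(random_value, always_one_value)
```

Because the behavior policy here is uniform, the random baseline's importance weights are all 1 and its estimate reduces to the average logged return. The deterministic candidate receives non-zero weight on only two of the four episodes, so its estimate rests on half the data; reporting both numbers side by side makes that kind of gap easier to see.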
Abstract
In the rapidly changing healthcare landscape, the implementation of offline reinforcement learning (RL) in dynamic treatment regimes (DTRs) presents a mix of unprecedented opportunities and challenges. This position paper offers a critical examination of the current status of offline RL in the context of DTRs. We argue for a reassessment of applying RL in DTRs, citing concerns such as inconsistent and potentially inconclusive evaluation metrics, the absence of naive and supervised learning baselines, and the diverse choice of RL formulation in existing research. Through a case study with more than 17,000 evaluation experiments using a publicly available Sepsis dataset, we demonstrate that the performance of RL algorithms can significantly vary with changes in evaluation metrics and Markov Decision Process (MDP) formulations. Surprisingly, it is observed that in some instances, RL…
Taxonomy
Topics: Advanced Causal Inference Techniques
