Model Selection for Offline Reinforcement Learning: Practical Considerations for Healthcare Settings

Shengpu Tang; Jenna Wiens

arXiv:2107.11003 · cs.LG · July 26, 2021 · 23 cites

TL;DR

This paper evaluates different off-policy evaluation methods for offline reinforcement learning in healthcare, proposing a two-stage model selection approach that balances accuracy and computational efficiency, with a focus on sepsis treatment policies.

Contribution

The paper analyzes off-policy evaluation (OPE) methods for offline RL in healthcare, finds that fitted Q evaluation (FQE) ranks candidate policies most effectively but at a high computational cost, and introduces a practical two-stage model selection pipeline.
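For readers unfamiliar with FQE, a minimal tabular sketch is given here. It is not the authors' implementation (the paper concerns function approximators such as neural networks); the function names, the `(s, a, r, s_next, done)` tuple layout, and the default hyperparameters are assumptions made for illustration only.

```python
import numpy as np


def fitted_q_evaluation(transitions, policy, n_states, n_actions,
                        gamma=0.99, n_iters=200):
    """Tabular FQE sketch: estimate Q^pi from logged transitions.

    transitions: list of (s, a, r, s_next, done) tuples (assumed layout).
    policy: array of shape (n_states, n_actions) with action probabilities
            of the policy being evaluated.
    """
    s, a, r, s_next, done = (np.asarray(x) for x in zip(*transitions))
    done = done.astype(float)
    q = np.zeros((n_states, n_actions))
    for _ in range(n_iters):
        # Bootstrapped target: expected Q at s' under the evaluation policy.
        v_next = (policy[s_next] * q[s_next]).sum(axis=1)
        targets = r + gamma * (1.0 - done) * v_next
        # Regression step: in the tabular case, the least-squares fit is
        # the mean target for each observed (s, a) pair.
        sums = np.zeros_like(q)
        counts = np.zeros_like(q)
        np.add.at(sums, (s, a), targets)
        np.add.at(counts, (s, a), 1.0)
        q = np.divide(sums, counts, out=np.zeros_like(q), where=counts > 0)
    return q


def policy_value(q, policy, initial_states):
    """Average estimated value of the policy over logged initial states."""
    s0 = np.asarray(initial_states)
    return float((policy[s0] * q[s0]).sum(axis=1).mean())
```

The repeated regression loop is what makes FQE expensive relative to estimators that need only a single pass over the data: with a function approximator, each iteration is a full model fit, and the whole procedure is rerun for every candidate policy.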

Findings

01. Fitted Q evaluation (FQE) yields the best policy ranking.
02. FQE is computationally intensive compared to other OPE methods.
03. A two-stage approach improves efficiency without sacrificing ranking quality.
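One plausible reading of the two-stage idea can be sketched as follows. This is a hedged illustration rather than the paper's pipeline: `cheap_ope` and `expensive_ope` are hypothetical callables standing in for an inexpensive OPE estimator and for FQE, and `top_k` is an assumed shortlist size that the summary does not specify.

```python
def two_stage_select(candidates, cheap_ope, expensive_ope, top_k):
    """Two-stage model selection sketch.

    candidates: dict mapping a policy name to a policy object.
    cheap_ope, expensive_ope: hypothetical callables, policy -> estimated value.
    top_k: assumed number of candidates kept after the first stage.
    """
    # Stage 1: score every candidate with the inexpensive estimator.
    cheap_scores = {name: cheap_ope(pi) for name, pi in candidates.items()}
    shortlist = sorted(cheap_scores, key=cheap_scores.get, reverse=True)[:top_k]
    # Stage 2: run the costly estimator only on the shortlisted candidates.
    final_scores = {name: expensive_ope(candidates[name]) for name in shortlist}
    best = max(final_scores, key=final_scores.get)
    return best, shortlist, final_scores
```

Under this sketch the expensive estimator runs `top_k` times instead of once per candidate, which is where the efficiency gain would come from. Ranking quality depends on the first stage not discarding the best candidates.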

Abstract

Reinforcement learning (RL) can be used to learn treatment policies and aid decision making in healthcare. However, given the need for generalization over complex state/action spaces, the incorporation of function approximators (e.g., deep neural networks) requires model selection to reduce overfitting and improve policy performance at deployment. Yet a standard validation pipeline for model selection requires running a learned policy in the actual environment, which is often infeasible in a healthcare setting. In this work, we investigate a model selection pipeline for offline RL that relies on off-policy evaluation (OPE) as a proxy for validation performance. We present an in-depth analysis of popular OPE methods, highlighting the additional hyperparameters and computational requirements (fitting/inference of auxiliary models) when used to rank a set of candidate policies. We compare…

Code & Models

Repositories

MLD3/OfflineRL_ModelSelection
tf · Official

Taxonomy

Topics: Sepsis Diagnosis and Treatment · Machine Learning in Healthcare · Cardiac Arrest and Resuscitation