Estimating Dynamic Treatment Regimes in Mobile Health Using V-learning
Daniel J. Luckett, Eric B. Laber, Anna R. Kahkoska, David M. Maahs, Elizabeth Mayer-Davis, Michael R. Kosorok

TL;DR
This paper introduces V-learning, a reinforcement learning approach for estimating personalized, dynamic treatment regimes in mobile health, capable of handling real-time, minute-by-minute decision making for chronic disease management.
Contribution
It develops a new method for estimating optimal dynamic treatment regimes from mobile health data with many frequent decision points, whereas existing methods are designed for a small number of fixed decision points on a coarse time scale.
Findings
The proposed estimator is consistent and asymptotically normal.
The method is applied to blood glucose control in patients with type 1 diabetes.
The method supports indefinite time horizons and high-frequency decision making.
Abstract
The vision for precision medicine is to use individual patient characteristics to inform a personalized treatment plan that leads to the best healthcare possible for each patient. Mobile technologies have an important role to play in this vision as they offer a means to monitor a patient's health status in real-time and subsequently to deliver interventions if, when, and in the dose that they are needed. Dynamic treatment regimes formalize individualized treatment plans as sequences of decision rules, one per stage of clinical intervention, that map current patient information to a recommended treatment. However, existing methods for estimating optimal dynamic treatment regimes are designed for a small number of fixed decision points occurring on a coarse time-scale. We propose a new reinforcement learning method for estimating an optimal treatment regime that is applicable to data…
