$Q$- and $A$-Learning Methods for Estimating Optimal Dynamic Treatment Regimes

Phillip J. Schulte; Anastasios A. Tsiatis; Eric B. Laber; Marie Davidian

arXiv:1202.4177·stat.ME·February 4, 2015

TL;DR

This paper reviews Q- and A-learning methods for estimating optimal dynamic treatment regimes, highlighting their performance and illustrating their application with depression study data.

Contribution

It provides a detailed comparison of Q- and A-learning approaches and demonstrates their practical use in clinical decision-making.

Findings

01 Q- and A-learning effectively estimate optimal treatment regimes

02 Performance varies depending on data and context

03 Application to depression data illustrates real-world utility

Abstract

In clinical practice, physicians make a series of treatment decisions over the course of a patient's disease based on his/her baseline and evolving characteristics. A dynamic treatment regime is a set of sequential decision rules that operationalizes this process. Each rule corresponds to a decision point and dictates the next treatment action based on the accrued information. Using existing data, a key goal is estimating the optimal regime, which, if followed by the patient population, would yield the most favorable outcome on average. Q- and A-learning are two main approaches for this purpose. We provide a detailed account of these methods, study their performance, and illustrate them using data from a depression study.
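
To make the idea of sequential decision rules concrete, the following is a minimal sketch of two-stage Q-learning by backward induction. It is not the authors' implementation. The simulated data, the linear working models, the binary 0/1 treatment coding, and all function names (`simulate`, `q_learning_two_stage`, `rule1`, `rule2`) are assumptions made for illustration only.

```python
import numpy as np


def simulate(n=5000, seed=0):
    """Hypothetical two-stage randomized study (illustrative data only)."""
    rng = np.random.default_rng(seed)
    x1 = rng.normal(size=n)                  # baseline covariate
    a1 = rng.integers(0, 2, size=n)          # stage-1 treatment, coded 0/1
    x2 = 0.5 * x1 + rng.normal(size=n)       # covariate observed before stage 2
    a2 = rng.integers(0, 2, size=n)          # stage-2 treatment, coded 0/1
    # Outcome (larger is better); treatment effects depend on covariates.
    y = (1.0 + x1 + x2
         + a1 * (0.3 - 0.8 * x1)
         + a2 * (0.5 + 1.0 * x2)
         + rng.normal(size=n))
    return x1, a1, x2, a2, y


def fit_ols(X, y):
    """Least-squares coefficients for y ~ X (X already has an intercept column)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def q_learning_two_stage(x1, a1, x2, a2, y):
    """Fit linear working Q-functions, last decision first."""
    one = np.ones(len(y))

    # Stage 2: regress the outcome on history and the stage-2 treatment.
    # Columns 0-4 are main effects; columns 5-6 form the contrast b2[5] + b2[6]*x2.
    D2 = np.column_stack([one, x1, a1, a1 * x1, x2, a2, a2 * x2])
    b2 = fit_ols(D2, y)
    contrast2 = b2[5] + b2[6] * x2

    # Pseudo-outcome: predicted outcome if the better stage-2 option were chosen.
    v2 = D2[:, :5] @ b2[:5] + np.maximum(contrast2, 0.0)

    # Stage 1: regress the pseudo-outcome on baseline information and a1.
    # The linear form here is only a working model and may be misspecified.
    D1 = np.column_stack([one, x1, a1, a1 * x1])
    b1 = fit_ols(D1, v2)
    return b1, b2


def rule1(b1, x1):
    """Estimated stage-1 rule: treat (1) when the fitted contrast is positive."""
    return (b1[2] + b1[3] * np.asarray(x1) > 0).astype(int)


def rule2(b2, x2):
    """Estimated stage-2 rule: treat (1) when the fitted contrast is positive."""
    return (b2[5] + b2[6] * np.asarray(x2) > 0).astype(int)
```

Calling `b1, b2 = q_learning_two_stage(*simulate())` and then `rule2(b2, x2_new)` gives the recommended stage-2 action for new covariate values. The estimated regime is the pair of rules, one per decision point, each mapping accrued information to a treatment.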
