TL;DR
This paper frames the learning of individualized treatment rules from observational data as a contextual bandit problem, and reports treatment policies that improve on both baseline models and physicians' decisions in simulations and on real-world data.
Contribution
It proposes a new method for deriving personalized treatment policies using estimated translated inverse propensity scores within a contextual bandit framework.
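To make the idea of a translated inverse propensity score concrete, here is a minimal sketch of a generic translated IPS risk estimate for a logged contextual bandit dataset. It is not the paper's implementation. The function name, argument names, and the optional weight clipping are assumptions made for illustration, and the formula is the commonly used form in which a constant is subtracted from each loss before reweighting and added back afterwards.

```python
import numpy as np


def translated_ips_risk(losses, target_probs, logging_probs,
                        translation=0.0, clip=None):
    """Estimate the risk of a target policy from logged bandit data.

    All names here are hypothetical, chosen for this sketch.
    losses        : observed loss for the treatment actually given
    target_probs  : probability the evaluated policy assigns to that treatment
    logging_probs : (estimated) propensity of the observed treatment
    translation   : constant subtracted from the losses before reweighting
    clip          : optional upper bound on the importance weights
    """
    losses = np.asarray(losses, dtype=float)
    target_probs = np.asarray(target_probs, dtype=float)
    logging_probs = np.asarray(logging_probs, dtype=float)

    # Importance weights correct for the mismatch between the policy that
    # generated the data and the policy being evaluated.
    weights = target_probs / logging_probs
    if clip is not None:
        weights = np.minimum(weights, clip)

    # Shift the losses, reweight, then add the shift back.
    return translation + np.mean((losses - translation) * weights)


# Toy data: four patients, two treatments, uniform logging propensities.
losses = [1.0, 0.0, 1.0, 0.0]
logging_probs = [0.5, 0.5, 0.5, 0.5]

# Evaluating the logging policy itself recovers the mean observed loss.
same_policy = translated_ips_risk(losses, logging_probs, logging_probs,
                                  translation=0.3)

# A deterministic policy that only agrees with the zero-loss decisions.
better_policy = translated_ips_risk(losses, [0.0, 1.0, 0.0, 1.0],
                                    logging_probs, translation=0.5)
```

In this toy example the weights average to one, so the translation constant does not change the estimate. It matters when the weights do not average to one, which is typical once propensities are estimated from observational data.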
Findings
The proposed method outperforms baseline treatment prediction models.
It achieves better treatment policies than physicians in real-world data.
The framework is validated through simulation and clinical data analysis.
Abstract
Randomized controlled trials typically analyze the effectiveness of treatments with the goal of making treatment recommendations for patient subgroups. With the advance of electronic health records, a great variety of data has been collected in clinical practice, enabling the evaluation of treatments and treatment policies based on observational data. In this paper, we focus on learning individualized treatment rules (ITRs) to derive a treatment policy that is expected to generate a better outcome for an individual patient. In our framework, we cast ITR learning as a contextual bandit problem and minimize the expected risk of the treatment policy. We conduct experiments with the proposed framework both in a simulation study and based on a real-world dataset. In the latter case, we apply our proposed method to learn the optimal ITRs for the administration of intravenous (IV) fluids and…
