Boosted Distributional Reinforcement Learning: Analysis and Healthcare Applications
Zequn Chen, Wesley J. Marrero

TL;DR
This paper introduces Boosted Distributional Reinforcement Learning (BDRL), a novel algorithm for healthcare decision-making that models outcome distributions, enforces comparability, and improves treatment consistency and quality-adjusted life years.
Contribution
The paper proposes BDRL, which optimizes outcome distributions for individual agents, enforces comparability, and stabilizes learning through a convex optimization-based projection.
Findings
BDRL increases the number of quality-adjusted life years (QALYs).
BDRL enhances consistency of healthcare outcomes.
BDRL modifies treatments for median and vulnerable patients.
Abstract
Researchers and practitioners are increasingly considering reinforcement learning to optimize decisions in complex domains like robotics and healthcare. To date, these efforts have largely utilized expectation-based learning. However, relying on expectation-focused objectives may be insufficient for making consistent decisions in highly uncertain situations involving multiple heterogeneous groups. While distributional reinforcement learning algorithms have been introduced to model the full distributions of outcomes, they can yield large discrepancies in realized benefits among comparable agents. This challenge is particularly acute in healthcare settings, where physicians (controllers) must manage multiple patients (subordinate agents) with uncertain disease progression and heterogeneous treatment responses. We propose a Boosted Distributional Reinforcement Learning (BDRL) algorithm…
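The abstract's point that expectation-focused objectives can hide inconsistency among comparable agents can be illustrated with a small sketch. This is not the BDRL algorithm: the two policies, the QALY values, and the `summarize` helper are all hypothetical and chosen only so that both policies share the same mean outcome.

```python
import statistics

# Hypothetical QALY outcomes for five comparable patients under two
# treatment policies. The numbers are made up for illustration only.
policy_a = [9.0, 10.0, 10.0, 11.0, 10.0]
policy_b = [2.0, 18.0, 10.0, 4.0, 16.0]


def summarize(outcomes):
    """Return (mean, population standard deviation, worst outcome)."""
    mean = statistics.fmean(outcomes)
    spread = statistics.pstdev(outcomes)
    worst = min(outcomes)
    return mean, spread, worst


mean_a, spread_a, worst_a = summarize(policy_a)
mean_b, spread_b, worst_b = summarize(policy_b)

# An expectation-based objective sees no difference between the policies,
# while the spread and the worst-case outcome differ substantially.
print(f"policy A: mean={mean_a:.2f} spread={spread_a:.2f} worst={worst_a:.2f}")
print(f"policy B: mean={mean_b:.2f} spread={spread_b:.2f} worst={worst_b:.2f}")
```

Both policies have the same expected outcome, so a purely expectation-based criterion cannot rank them, even though policy B leaves some comparable patients far worse off than others. Looking at the full distribution of outcomes exposes that difference.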
