Counterfactual Q Learning via the Linear Buckley James Method for Longitudinal Survival Data
Jeongjin Lee, Jong-Min Kim

TL;DR
This paper presents a reinforcement learning framework that combines the Buckley-James method with Q-learning to estimate optimal treatment strategies from censored survival data, with the aim of supporting personalized healthcare decisions.
Contribution
It introduces the Counterfactual Buckley-James Q-Learning method, integrating imputation of censored data with reinforcement learning for dynamic treatment optimization.
Findings
Robust performance across various censoring scenarios in simulations
Effective estimation of optimal treatment regimes in clinical data
Interpretable and reliable survival outcome predictions
Abstract
Treatment strategies are critical in healthcare, particularly when outcomes are subject to censoring. This study introduces the Counterfactual Buckley-James Q-Learning framework, which integrates the Buckley-James method with reinforcement learning to address challenges posed by censored survival data. The Buckley-James method imputes censored survival times via conditional expectations based on observed data, offering a robust mechanism for handling incomplete outcomes. By incorporating these imputed values into a counterfactual Q-learning framework, the proposed method enables the estimation and comparison of potential outcomes under different treatment strategies. This facilitates the identification of optimal dynamic treatment regimes that maximize expected survival time. Through extensive simulation studies, the method demonstrates robust performance across various sample sizes and…
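The two ingredients named in the abstract can be illustrated with a minimal single-stage sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the function names (`km_residual_masses`, `buckley_james_impute`, `fit_q_one_stage`), the linear model on log survival time, the fixed iteration count, the simulated data, and the convention of treating the largest residual as an event so the Kaplan-Meier masses sum to one. The paper addresses longitudinal, multi-stage data; the sketch shows only one decision point.

```python
import numpy as np

def km_residual_masses(resid, delta):
    """Kaplan-Meier probability masses on the sorted residuals (ties ignored)."""
    order = np.argsort(resid, kind="stable")
    e, d = resid[order], delta[order].astype(float)
    d[-1] = 1.0  # assumed convention: largest residual counts as an event
    at_risk = len(e) - np.arange(len(e))
    surv = np.cumprod(1.0 - d / at_risk)
    mass = np.concatenate(([1.0], surv[:-1])) - surv
    return e, mass

def buckley_james_impute(X, y, delta, n_iter=20):
    """Replace censored y by fitted value + E[residual | residual > observed residual]."""
    Xd = np.column_stack([np.ones(len(y)), X])
    # Start from a least-squares fit on uncensored rows only.
    beta = np.linalg.lstsq(Xd[delta == 1], y[delta == 1], rcond=None)[0]
    for _ in range(n_iter):
        fitted = Xd @ beta
        resid = y - fitted
        e, mass = km_residual_masses(resid, delta)
        y_star = y.copy()
        for i in np.flatnonzero(delta == 0):
            above = e > resid[i]
            denom = mass[above].sum()
            if denom > 0:  # the largest residual has no mass above it; keep y as is
                y_star[i] = fitted[i] + (mass[above] * e[above]).sum() / denom
        # Refit the linear model on the imputed outcomes.
        beta = np.linalg.lstsq(Xd, y_star, rcond=None)[0]
    return y_star, beta

def fit_q_one_stage(X, a, y_star):
    """Linear Q-function: Q(x, a) = [1, x] b + a * [1, x] c; treat when [1, x] c > 0."""
    Xd = np.column_stack([np.ones(len(a)), X])
    design = np.column_stack([Xd, a[:, None] * Xd])
    theta = np.linalg.lstsq(design, y_star, rcond=None)[0]
    p = Xd.shape[1]

    def rule(X_new):
        Xn = np.column_stack([np.ones(len(X_new)), X_new])
        return (Xn @ theta[p:] > 0).astype(int)

    return theta, rule

# Simulated example (hypothetical data-generating model, log time scale).
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
a = rng.integers(0, 2, size=n)
log_t = 1.0 + 0.5 * x + a * (0.8 * x) + rng.normal(scale=0.3, size=n)
log_c = rng.normal(loc=1.8, scale=0.5, size=n)
y = np.minimum(log_t, log_c)             # observed (possibly censored) log time
delta = (log_t <= log_c).astype(int)     # 1 = event observed, 0 = censored

features = np.column_stack([x, a, a * x])
y_star, beta = buckley_james_impute(features, y, delta)
theta, rule = fit_q_one_stage(x[:, None], a, y_star)
recommended = rule(x[:, None])           # estimated treatment for each subject
```

In this sketch, an imputed value never falls below the censored observation it replaces, because the conditional mean is taken over residuals strictly larger than the observed one. A multi-stage version would apply the same two steps backward in time, using the maximized Q-value of the later stage as the outcome for the earlier stage.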
