Reinforcement Learning enhanced Online Adaptive Clinical Decision Support via Digital Twin powered Policy and Treatment Effect optimized Reward
Xinyu Qin, Ruiheng Yu, Lu Wang

TL;DR
This paper introduces an online adaptive clinical decision support system that combines reinforcement learning, digital twins, and safety constraints to improve treatment policies with minimal expert intervention.
Contribution
It presents a novel framework integrating digital twins, uncertainty estimation, and safety rules for real-time adaptive clinical decision support.
Findings
Low latency and stable throughput in simulations
Reduced expert queries while maintaining safety
Improved treatment policy performance over baselines
Abstract
Clinical decision support must adapt online under safety constraints. We present an online adaptive tool where reinforcement learning provides the policy, a patient digital twin provides the environment, and the treatment effect defines the reward. The system initializes a batch-constrained policy from retrospective data and then runs a streaming loop that selects actions, checks safety, and queries experts only when uncertainty is high. Uncertainty comes from a compact ensemble of five Q-networks, measured as the coefficient of variation of action values and passed through a compression function. The digital twin updates the patient state with a bounded residual rule. The outcome model estimates immediate clinical effect, and the reward is the treatment effect relative to a conservative reference, z-score normalized with fixed statistics from the training split. Online updates operate on recent data with short runs and…
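The uncertainty gate, bounded twin update, and normalized reward described in the abstract can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation: the function names, the choice of tanh as the compression, the query threshold of 0.5, and the residual bound of 1.0 are all assumptions, since the abstract does not specify them.

```python
import numpy as np


def ensemble_uncertainty(q_values, eps=1e-8):
    """Per-action uncertainty from an ensemble of Q-value estimates.

    q_values has shape (n_members, n_actions), e.g. five rows for five
    Q-networks. Returns values in [0, 1).
    """
    q = np.asarray(q_values, dtype=float)
    mean = q.mean(axis=0)
    std = q.std(axis=0)
    cv = std / (np.abs(mean) + eps)  # coefficient of variation
    return np.tanh(cv)  # ASSUMPTION: tanh stands in for the unspecified compression


def select_action(q_values, is_safe, query_threshold=0.5):
    """Pick the best safe action and decide whether to query an expert.

    ASSUMPTION: safety is a boolean mask over actions, and the expert is
    queried when the chosen action's uncertainty exceeds query_threshold.
    Returns (action_index_or_None, query_expert).
    """
    q = np.asarray(q_values, dtype=float)
    safe = np.asarray(is_safe, dtype=bool)
    if not safe.any():
        return None, True  # no safe action available: defer to the expert
    mean = q.mean(axis=0)
    action = int(np.argmax(np.where(safe, mean, -np.inf)))
    uncertainty = ensemble_uncertainty(q)[action]
    return action, bool(uncertainty > query_threshold)


def twin_step(state, predicted_residual, bound=1.0):
    """Bounded residual update: the state changes by at most `bound` per feature.

    ASSUMPTION: the bound is applied by elementwise clipping.
    """
    residual = np.clip(np.asarray(predicted_residual, dtype=float), -bound, bound)
    return np.asarray(state, dtype=float) + residual


def treatment_effect_reward(outcome, reference_outcome, train_mean, train_std):
    """Treatment effect relative to a reference, z-scored with fixed training statistics."""
    effect = outcome - reference_outcome
    return (effect - train_mean) / train_std
```

Fixing `train_mean` and `train_std` from the training split, rather than re-estimating them online, keeps the reward scale stable while the policy updates.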
