Development and Validation of Heparin Dosing Policies Using an Offline Reinforcement Learning Algorithm
Yooseok Lim, Inbeom Park, Sujee Lee

TL;DR
This paper develops and validates a reinforcement learning-based personalized heparin dosing policy for ICU patients, aiming to improve safety and efficacy in medication administration using extensive clinical data.
Contribution
It introduces a batch-constrained offline RL approach for personalized heparin dosing, effectively integrating clinician policies and minimizing out-of-distribution errors.
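To illustrate what a batch-constrained policy can look like for a discretized dose, here is a minimal sketch. It assumes the thresholding rule from discrete batch-constrained Q-learning: only actions that the estimated clinician (behavior) policy takes with sufficient relative probability are eligible, and the highest Q-value among them is chosen. The function name, the threshold value, and the example numbers are hypothetical and are not taken from the paper, which may use a different variant.

```python
from typing import Sequence


def batch_constrained_action(
    q_values: Sequence[float],
    behavior_probs: Sequence[float],
    threshold: float = 0.3,  # hypothetical value, not from the paper
) -> int:
    """Return the index of the highest-Q action among actions that the
    behavior (clinician) policy takes with relative probability >= threshold."""
    if len(q_values) != len(behavior_probs):
        raise ValueError("q_values and behavior_probs must have equal length")
    max_prob = max(behavior_probs)
    if max_prob <= 0:
        raise ValueError("behavior_probs must contain a positive probability")
    # Keep only actions that are well supported by the logged clinician data.
    allowed = [
        action
        for action, prob in enumerate(behavior_probs)
        if prob / max_prob >= threshold
    ]
    # Among the supported actions, act greedily with respect to Q.
    return max(allowed, key=lambda action: q_values[action])


# Hypothetical example with four discretized dose levels.
q_values = [0.1, 0.4, 0.9, 1.5]
behavior_probs = [0.05, 0.50, 0.40, 0.05]
chosen = batch_constrained_action(q_values, behavior_probs)
```

In this example the unconstrained greedy choice would be dose level 3, which clinicians rarely select in the data, so its Q-value estimate is the one most exposed to out-of-distribution error. The constraint restricts the choice to levels 1 and 2 and picks level 2.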
Findings
The RL policy reliably guides dosing within therapeutic ranges.
Quantitative evaluation shows improved dosing accuracy.
Qualitative analysis reveals meaningful state-Q-value relationships.
Abstract
Appropriate medication dosages in the intensive care unit (ICU) are critical for patient survival. Heparin, used to treat thrombosis and inhibit blood clotting in the ICU, requires careful administration due to its complexity and sensitivity to various factors, including patient clinical characteristics, underlying medical conditions, and potential drug interactions. Incorrect dosing can lead to severe complications such as strokes or excessive bleeding. To address these challenges, this study proposes a reinforcement learning (RL)-based personalized optimal heparin dosing policy that guides dosing decisions reliably within the therapeutic range based on individual patient conditions. A batch-constrained policy was implemented to minimize out-of-distribution errors in an offline RL environment and effectively integrate RL with existing clinician policies. The policy's effectiveness was…
Taxonomy
Topics: Heparin-Induced Thrombocytopenia and Thrombosis · Atrial Fibrillation Management and Outcomes
