Offline Inverse Constrained Reinforcement Learning for Safe-Critical Decision Making in Healthcare
Nan Fang, Guiliang Liu, Wei Gong

TL;DR
This paper introduces the Constraint Transformer (CT), a novel offline inverse constrained reinforcement learning method that models historical healthcare decisions to improve safety and reduce unsafe behaviors in medical decision-making.
Contribution
The paper proposes the Constraint Transformer, which incorporates historical data and non-Markovian constraints into offline RL for healthcare safety, addressing limitations of existing ICRL methods.
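To make the term "non-Markovian constraint" concrete, here is a minimal sketch contrasting a cost that looks only at the current dose with one that also looks at the previous dose. It is an illustration only, not the paper's learned constraint: the names `MAX_DOSE`, `MAX_JUMP`, `markovian_cost`, and `history_cost`, and the threshold values, are hypothetical.

```python
from typing import Sequence

# Hypothetical thresholds, chosen only for illustration.
MAX_DOSE = 10.0  # largest dose allowed at any single step
MAX_JUMP = 3.0   # largest allowed change between consecutive doses


def markovian_cost(dose: float) -> float:
    """Cost that depends only on the current action (the current dose)."""
    return 1.0 if dose > MAX_DOSE else 0.0


def history_cost(doses: Sequence[float]) -> float:
    """Cost for the latest step that also depends on the previous dose."""
    if len(doses) < 2:
        return 0.0  # no earlier dose to compare against
    return 1.0 if abs(doses[-1] - doses[-2]) > MAX_JUMP else 0.0


doses = [2.0, 2.5, 8.0]

# Every dose is below MAX_DOSE, so the per-step check flags nothing.
per_step_markov = [markovian_cost(d) for d in doses]

# The jump from 2.5 to 8.0 exceeds MAX_JUMP, so the last step is flagged.
per_step_history = [history_cost(doses[: t + 1]) for t in range(len(doses))]
```

In this toy trajectory the abrupt change is visible only to the cost that sees the treatment history. An offline method can draw on that kind of signal because recorded trajectories contain the earlier steps.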
Findings
CT captures unsafe states effectively
Learns strategies associated with lower mortality rates
Reduces the probability of unsafe decisions
Abstract
Reinforcement Learning (RL) applied in healthcare can lead to unsafe medical decisions and treatment, such as excessive dosages or abrupt changes, often due to agents overlooking common-sense constraints. Consequently, Constrained Reinforcement Learning (CRL) is a natural choice for safe decisions. However, specifying the exact cost function is inherently difficult in healthcare. Recent Inverse Constrained Reinforcement Learning (ICRL) is a promising approach that infers constraints from expert demonstrations. ICRL algorithms model Markovian decisions in an interactive environment. These settings do not align with the practical requirement of a decision-making system in healthcare, where decisions rely on historical treatment recorded in an offline dataset. To tackle these issues, we propose the Constraint Transformer (CT). Specifically, 1) we utilize a causal attention mechanism to…
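The abstract is cut off before it says how the causal attention mechanism is used, so the following is only a generic sketch of causal (masked) self-attention over a sequence of per-step feature vectors, not the Constraint Transformer architecture. It assumes a single head with no learned projections, and the function name `causal_attention` is hypothetical. The point it shows is that step t can attend to steps 0..t but never to later steps.

```python
import math
from typing import List, Tuple


def causal_attention(
    x: List[List[float]],
) -> Tuple[List[List[float]], List[List[float]]]:
    """Single-head scaled dot-product self-attention with a causal mask.

    x holds T feature vectors of dimension d. The inputs serve directly as
    queries, keys, and values (no learned projections, to keep the sketch
    short). Returns (outputs, weights); weights[t][s] is 0.0 whenever s > t.
    """
    T, d = len(x), len(x[0])
    outputs, weights = [], []
    for t in range(T):
        # Score step t against itself and earlier steps only.
        scores = [
            sum(a * b for a, b in zip(x[t], x[s])) / math.sqrt(d)
            for s in range(t + 1)
        ]
        # Numerically stable softmax over the visible (past) positions.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        w = [e / z for e in exps] + [0.0] * (T - t - 1)  # future gets zero weight
        weights.append(w)
        outputs.append([sum(w[s] * x[s][j] for s in range(T)) for j in range(d)])
    return outputs, weights


# Toy history of three time steps with two features each (made-up values).
history = [[1.0, 0.0], [0.5, 0.5], [0.0, 2.0]]
outputs, weights = causal_attention(history)
```

Because the first step can only attend to itself, its output equals its input. Later steps mix in information from the steps before them, which is how a history-dependent signal could be read off an offline trajectory.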
Taxonomy
Topics: Quality and Safety in Healthcare · Healthcare Operations and Scheduling Optimization · EEG and Brain-Computer Interfaces
Methods: Attention Is All You Need · Dense Connections · Residual Connection · Position-Wise Feed-Forward Layer · Adam · Linear Layer · Label Smoothing · Dropout · Byte Pair Encoding · Absolute Position Encodings
