Offline Inverse Constrained Reinforcement Learning for Safe-Critical Decision Making in Healthcare

Nan Fang; Guiliang Liu; Wei Gong

arXiv:2410.07525 · cs.LG · October 15, 2024

Open Access

TL;DR

This paper introduces the Constraint Transformer (CT), a novel offline inverse constrained reinforcement learning method that models historical healthcare decisions to improve safety and reduce unsafe behaviors in medical decision-making.

Contribution

The paper proposes the Constraint Transformer, which incorporates historical data and non-Markovian constraints into offline RL for healthcare safety, addressing limitations of existing ICRL methods.

Findings

01 CT captures unsafe states effectively

02 Achieves strategies with lower mortality rates

03 Reduces unsafe decision probabilities

Abstract

Reinforcement Learning (RL) applied in healthcare can lead to unsafe medical decisions and treatment, such as excessive dosages or abrupt changes, often due to agents overlooking common-sense constraints. Consequently, Constrained Reinforcement Learning (CRL) is a natural choice for safe decisions. However, specifying the exact cost function is inherently difficult in healthcare. Recent Inverse Constrained Reinforcement Learning (ICRL) is a promising approach that infers constraints from expert demonstrations. ICRL algorithms model Markovian decisions in an interactive environment. These settings do not align with the practical requirement of a decision-making system in healthcare, where decisions rely on historical treatment recorded in an offline dataset. To tackle these issues, we propose the Constraint Transformer (CT). Specifically, 1) we utilize a causal attention mechanism to…
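The abstract's mention of a causal attention mechanism can be illustrated with a minimal sketch. This is not the authors' Constraint Transformer: it is generic single-head causal attention in plain Python, and the function names (`causal_attention_weights`, `attend`), the toy `history` values, and the use of the raw history as queries, keys, and values (no learned projections) are all assumptions made for illustration. It shows only the masking idea, in which the summary at step t depends on steps 0..t and never on later ones.

```python
import math


def causal_attention_weights(queries, keys):
    """Return a T x T weight matrix; row t is a softmax over steps 0..t only."""
    d = len(keys[0])
    weights = []
    for t, q in enumerate(queries):
        # Causal mask: score only the current step and earlier ones.
        scores = [
            sum(a * b for a, b in zip(q, keys[s])) / math.sqrt(d)
            for s in range(t + 1)
        ]
        peak = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        # Future steps receive exactly zero weight.
        row = [e / total for e in exps] + [0.0] * (len(keys) - t - 1)
        weights.append(row)
    return weights


def attend(weights, values):
    """Weighted sum of value vectors for every step."""
    width = len(values[0])
    return [
        [sum(w * v[j] for w, v in zip(row, values)) for j in range(width)]
        for row in weights
    ]


# Hypothetical 3-step treatment history with 2 features per step.
history = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = causal_attention_weights(history, history)
context = attend(weights, history)
```

Because future steps are masked out, changing the last entry of `history` leaves the first two rows of `context` unchanged, which is the property a history-dependent (non-Markovian) model evaluated on offline trajectories needs.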

Taxonomy

Topics: Quality and Safety in Healthcare · Healthcare Operations and Scheduling Optimization · EEG and Brain-Computer Interfaces

Methods: Attention Is All You Need · Dense Connections · Residual Connection · Position-Wise Feed-Forward Layer · Adam · Linear Layer · Label Smoothing · Dropout · Byte Pair Encoding · Absolute Position Encodings