Multi-Agent Reinforcement Learning for Intraday Operating Rooms Scheduling under Uncertainty
Kailiang Liu, Ying Chen, Ralf Borndörfer, Thorsten Koch

TL;DR
This paper introduces a multi-agent reinforcement learning framework for real-time intraday operating room scheduling under uncertainty, outperforming heuristics and providing interpretable policies.
Contribution
It formulates OR scheduling as a cooperative Markov game and develops a MARL approach with centralized training and decentralized execution, incorporating rich system states and conflict-free scheduling.
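As a rough illustration of how a conflict-free joint schedule might be assembled when several rooms decide within the same epoch, the sketch below lets rooms pick one after another from a shrinking pool of waiting cases. The function names, the room and case identifiers, and the `first_available` stand-in policy are all hypothetical; the paper's shared PPO policy and its state features are not reproduced here.

```python
from typing import Callable, Dict, List, Optional


def sequential_assignment(
    idle_rooms: List[str],
    waiting_cases: List[str],
    choose: Callable[[str, List[str]], Optional[str]],
) -> Dict[str, str]:
    """Assign at most one waiting case to each idle room, one room at a time.

    Each room only sees the cases not yet taken by earlier rooms in the
    same epoch, so no case can be assigned twice.
    """
    remaining = list(waiting_cases)  # copy so the caller's list is untouched
    schedule: Dict[str, str] = {}
    for room in idle_rooms:
        if not remaining:
            break
        case = choose(room, remaining)
        if case is None:  # the room may decide to stay idle
            continue
        if case not in remaining:
            raise ValueError(f"{room} chose unavailable case {case!r}")
        schedule[room] = case
        remaining.remove(case)
    return schedule


def first_available(room: str, candidates: List[str]) -> Optional[str]:
    """Placeholder policy (an assumption): take the first remaining case."""
    return candidates[0] if candidates else None


plan = sequential_assignment(["OR1", "OR2", "OR3"], ["c1", "c2"], first_available)
print(plan)
```

In a learned setting, `choose` would be replaced by the shared policy evaluated on each room's observation; the ordering of rooms within an epoch is a design choice this sketch does not address.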
Findings
Learned policy outperforms rule-based heuristics across multiple metrics
Policy prioritizes emergencies and batches similar cases to reduce setups
Quantifies optimality gaps relative to an ex post MIP oracle
Abstract
Intraday surgical scheduling is a multi-objective decision problem under uncertainty, balancing elective throughput, urgent and emergency demand, delays, sequence-dependent setups, and overtime. We formulate the problem as a cooperative Markov game and propose a multi-agent reinforcement learning (MARL) framework in which each operating room (OR) is an agent trained with centralized training and decentralized execution. All agents share a policy trained via Proximal Policy Optimization (PPO), which maps rich system states to actions, while a within-epoch sequential assignment protocol constructs conflict-free joint schedules across ORs. A mixed-integer pre-schedule provides reference starting times for electives; we impose type-specific quadratic delay penalties relative to these references and a terminal overtime penalty, yielding a single reward that captures throughput, timeliness,…
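The penalty terms mentioned in the abstract can be sketched as follows. This is only a minimal illustration: the weights, the time unit (minutes), the linear form of the overtime term, the use of a reference time for non-elective cases, and the treatment of early starts as zero delay are all assumptions, and any further reward terms are omitted.

```python
from typing import Dict

# Hypothetical type-specific weights; the paper's values are not given here.
DELAY_WEIGHTS: Dict[str, float] = {
    "elective": 1.0,
    "urgent": 4.0,
    "emergency": 16.0,
}
OVERTIME_WEIGHT = 2.0  # hypothetical weight on overtime minutes


def delay_penalty(case_type: str, start: float, reference: float) -> float:
    """Quadratic penalty on the delay of `start` past `reference` (minutes).

    Starting at or before the reference time incurs no penalty (assumption).
    """
    delay = max(0.0, start - reference)
    return DELAY_WEIGHTS[case_type] * delay ** 2


def overtime_penalty(finish: float, shift_end: float) -> float:
    """Terminal penalty on minutes worked past the end of the shift.

    A linear form is assumed purely for illustration.
    """
    return OVERTIME_WEIGHT * max(0.0, finish - shift_end)


# An elective that starts 2 minutes after its reference time.
print(delay_penalty("elective", start=10.0, reference=8.0))
# A room that finishes 10 minutes after a shift ending at minute 480.
print(overtime_penalty(finish=490.0, shift_end=480.0))
```

Because the delay term grows quadratically, a single long delay costs more than several short ones of the same total length, which is one plausible reason to prefer it over a linear penalty.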
Taxonomy
Topics: Healthcare Operations and Scheduling Optimization · Advanced Queuing Theory Analysis · Risk and Portfolio Optimization
