Multi-Agent Reinforcement Learning for Intraday Operating Rooms Scheduling under Uncertainty

Kailiang Liu; Ying Chen; Ralf Borndörfer; Thorsten Koch

arXiv:2512.04918 · cs.LG · December 5, 2025

TL;DR

This paper introduces a multi-agent reinforcement learning framework for real-time intraday operating room scheduling under uncertainty, outperforming heuristics and providing interpretable policies.

Contribution

It formulates OR scheduling as a cooperative Markov game and develops a MARL approach with centralized training and decentralized execution, incorporating rich system states and conflict-free scheduling.
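The conflict-free construction can be illustrated with a minimal sketch. It assumes that, within one decision epoch, idle rooms act one at a time and each room may only pick from cases that no earlier room has claimed. The function names, the room ordering, and the `first_come` stand-in policy are hypothetical and are not taken from the paper, where a shared learned policy makes the choice.

```python
from typing import Callable, Dict, List, Optional


def sequential_assignment(
    idle_rooms: List[str],
    waiting_cases: List[str],
    choose: Callable[[str, List[str]], Optional[str]],
) -> Dict[str, Optional[str]]:
    """Assign at most one waiting case to each idle room, one room at a time.

    Each room only sees cases not yet claimed in this epoch, so the joint
    schedule cannot contain the same case twice.
    """
    remaining = list(waiting_cases)  # copy so the caller's list is untouched
    schedule: Dict[str, Optional[str]] = {}
    for room in idle_rooms:
        # A room with nothing left to choose from stays idle (None).
        pick = choose(room, remaining) if remaining else None
        if pick is not None:
            if pick not in remaining:
                raise ValueError(f"{room} chose unavailable case {pick!r}")
            remaining.remove(pick)  # mask the case for later rooms
        schedule[room] = pick
    return schedule


def first_come(room: str, candidates: List[str]) -> Optional[str]:
    """Hypothetical stand-in for the shared policy: take the first candidate."""
    return candidates[0]


plan = sequential_assignment(["OR1", "OR2", "OR3"], ["case_a", "case_b"], first_come)
print(plan)
```

In this sketch the masking of already-claimed cases is what guarantees a conflict-free joint action; any policy, learned or rule-based, could be passed in as `choose`.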

Findings

01 Learned policy outperforms rule-based heuristics across multiple metrics

02 Policy prioritizes emergencies and batches similar cases to reduce setups

03 Quantifies optimality gaps relative to an ex post MIP oracle

Abstract

Intraday surgical scheduling is a multi-objective decision problem under uncertainty, balancing elective throughput, urgent and emergency demand, delays, sequence-dependent setups, and overtime. We formulate the problem as a cooperative Markov game and propose a multi-agent reinforcement learning (MARL) framework in which each operating room (OR) is an agent trained with centralized training and decentralized execution. All agents share a policy trained via Proximal Policy Optimization (PPO), which maps rich system states to actions, while a within-epoch sequential assignment protocol constructs conflict-free joint schedules across ORs. A mixed-integer pre-schedule provides reference starting times for electives; we impose type-specific quadratic delay penalties relative to these references and a terminal overtime penalty, yielding a single reward that captures throughput, timeliness,…
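The penalty terms named in the abstract can be sketched as follows. This is only an illustration under stated assumptions: the weights in `DELAY_WEIGHT` and `OVERTIME_WEIGHT` are made up, times are in hours, the overtime term is taken to be linear, and the reference time for non-elective cases is left to the caller. None of these specifics come from the paper.

```python
# Hypothetical per-type weights; the paper's actual values are not given here.
DELAY_WEIGHT = {"elective": 1.0, "urgent": 4.0, "emergency": 16.0}
OVERTIME_WEIGHT = 2.0  # hypothetical weight on overtime hours


def delay_penalty(case_type: str, start: float, reference: float) -> float:
    """Type-specific quadratic penalty on starting later than the reference time."""
    delay = max(0.0, start - reference)  # starting early is not penalized here
    return DELAY_WEIGHT[case_type] * delay ** 2


def overtime_penalty(finish: float, shift_end: float) -> float:
    """Terminal penalty on time worked past the end of the shift (assumed linear)."""
    return OVERTIME_WEIGHT * max(0.0, finish - shift_end)


# An elective starting 30 minutes after its pre-scheduled reference time,
# and a room finishing 30 minutes after a 17:00 shift end.
print(delay_penalty("elective", start=9.5, reference=9.0))
print(overtime_penalty(finish=17.5, shift_end=17.0))
```

A quadratic form makes long delays disproportionately costly compared with several short ones, and a larger weight for a case type makes the same delay more expensive for that type.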

Taxonomy

Topics: Healthcare Operations and Scheduling Optimization · Advanced Queuing Theory Analysis · Risk and Portfolio Optimization