Online Demand Scheduling with Failovers
Konstantina Mellou, Marco Molinaro, and Rudy Zhou

TL;DR
This paper addresses the problem of optimally deploying hardware in cloud data centers under power and robustness constraints, introducing the Online Demand Scheduling with Failover problem and proposing algorithms for worst-case and stochastic demand models.
Contribution
It formulates a new online assignment problem with failover constraints and provides algorithms with provable guarantees for both worst-case and stochastic demand scenarios.
Findings
A deterministic algorithm achieves a competitive ratio of approximately 1/2 in the worst-case model.
Algorithms for the stochastic model attain sub-linear regret, approaching optimal utilization as demands grow.
New techniques include a configuration LP and an online monotone matching procedure.
Abstract
Motivated by cloud computing applications, we study the problem of how to optimally deploy new hardware subject to both power and robustness constraints. To model the situation observed in large-scale data centers, we introduce the Online Demand Scheduling with Failover problem. There are identical devices with capacity constraints. Demands come one-by-one and, to be robust against a device failure, need to be assigned to a pair of devices. When a device fails (in a failover scenario), each demand assigned to it is rerouted to its paired device (which may now run at increased capacity). The goal is to assign demands to the devices to maximize the total utilization subject to both the normal capacity constraints as well as these novel failover constraints. These latter constraints introduce new decision tradeoffs not present in classic assignment problems such as the Multiple…
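To make the failover constraint concrete, here is a minimal feasibility check in Python. It is only a sketch under assumptions that the abstract does not state: each demand is split evenly between its two devices during normal operation, every device has the same normal capacity and the same larger failover capacity, and only one device fails at a time. The names is_feasible, normal_cap, and failover_cap are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def is_feasible(assignments, normal_cap=1.0, failover_cap=1.5):
    """Check normal and single-failure capacity constraints.

    assignments: list of (size, dev_a, dev_b) tuples, one per demand.
    ASSUMPTION (not from the abstract): each device of the pair carries
    size / 2 in normal operation and the full size after its partner fails.
    """
    eps = 1e-9
    load = defaultdict(float)      # normal-operation load per device
    rerouted = defaultdict(float)  # rerouted[(i, j)]: load moved onto i if j fails

    for size, a, b in assignments:
        if a == b:
            return False  # a demand needs two distinct devices
        half = size / 2.0
        load[a] += half
        load[b] += half
        rerouted[(a, b)] += half
        rerouted[(b, a)] += half

    # Normal capacity constraint on every device.
    if any(l > normal_cap + eps for l in load.values()):
        return False

    # Failover constraint: for every surviving device i and failed partner j,
    # i's own load plus the load rerouted from j must fit the failover capacity.
    return all(load[i] + extra <= failover_cap + eps
               for (i, _j), extra in rerouted.items())
```

Under these assumptions, two unit-size demands placed on the same pair (0, 1) violate the failover constraint, because a failure of either device pushes the survivor to load 2.0. Placing them on pairs (0, 1) and (0, 2) is feasible, since device 0 peaks at 1.5. This is one way to see the tradeoff the abstract mentions: how demands are spread across pairs matters, not only the total load per device.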
