RackSched: A Microsecond-Scale Scheduler for Rack-Scale Computers (Technical Report)
Hang Zhu, Kostis Kaffes, Zixu Chen, Zhenming Liu, Christos Kozyrakis, Ion Stoica, Xin Jin

TL;DR
RackSched is a rack-level, microsecond-scale scheduler that combines inter-server scheduling in the top-of-rack switch with intra-server scheduling on each server. It improves throughput by up to 1.44x and scales near-linearly while keeping tail latency comparable to that of a single server.
Contribution
RackSched is presented as the first rack-scale scheduler built on network-system co-design: it integrates switch-based inter-server scheduling with intra-server scheduling on each server.
Findings
Up to 1.44x throughput improvement
Near-linear scalability of throughput
Maintains tail latency comparable to a single server
Abstract
Low-latency online services have strict Service Level Objectives (SLOs) that require datacenter systems to support high throughput at microsecond-scale tail latency. Dataplane operating systems have been designed to scale up multi-core servers with minimal overhead for such SLOs. However, as application demands continue to increase, scaling up is not enough, and serving larger demands requires these systems to scale out to multiple servers in a rack. We present RackSched, the first rack-level microsecond-scale scheduler that provides the abstraction of a rack-scale computer (i.e., a huge server with hundreds to thousands of cores) to an external service with network-system co-design. The core of RackSched is a two-layer scheduling framework that integrates inter-server scheduling in the top-of-rack (ToR) switch with intra-server scheduling in each server. We use a combination of…
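The two-layer structure described in the abstract can be illustrated with a toy sketch. The abstract is cut off before it names the scheduling policies, so everything here is an assumption made for illustration: the `RackDispatcher` and `Server` classes, the least-loaded placement rule, and the FIFO per-server queue are hypothetical stand-ins, not the paper's design.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Server:
    """One server in the rack: a FIFO of pending requests shared by its cores."""
    name: str
    cores: int
    queue: deque = field(default_factory=deque)

    def load(self) -> int:
        # Hypothetical load signal: number of outstanding requests.
        return len(self.queue)

    def run_one_round(self) -> list:
        # Intra-server layer (assumed FIFO): each core takes at most one request.
        done = []
        for _ in range(min(self.cores, len(self.queue))):
            done.append(self.queue.popleft())
        return done


class RackDispatcher:
    """Inter-server layer: stands in for the top-of-rack switch."""

    def __init__(self, servers):
        self.servers = list(servers)

    def dispatch(self, request) -> Server:
        # Assumed policy: place the request on the least-loaded server.
        # Ties go to the server listed first.
        target = min(self.servers, key=lambda s: s.load())
        target.queue.append(request)
        return target


servers = [Server("s0", cores=2), Server("s1", cores=2)]
rack = RackDispatcher(servers)
placement = [rack.dispatch(f"req{i}").name for i in range(4)]
print(placement)
```

The point of the sketch is only the division of labor: the top layer decides which server receives a request, and each server independently decides which core runs it. A client sees a single dispatch entry point, which matches the "rack-scale computer" abstraction the abstract describes.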
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Cloud Computing and Resource Management · Interconnection Networks and Systems
