Non-Asymptotic Delay Bounds for Multi-Server Systems with Synchronization Constraints
Markus Fidler, Brenton Walker, Yuming Jiang

TL;DR
This paper develops non-asymptotic delay bounds for multi-server fork-join systems with synchronization constraints, providing insights into their performance and comparing different configurations through theoretical analysis and real-system validation.
Contribution
It introduces a max-plus server model to derive explicit delay bounds for complex fork-join networks with multiple stages and servers, advancing understanding of their performance limits.
Findings
End-to-end delay bounds grow as O(h ln k) for h fork-join stages, each with k parallel servers
Non-idling single-queue systems outperform traditional models
Theoretical bounds align well with simulations and Spark system data
Abstract
Multi-server systems have received increasing attention with important implementations such as Google MapReduce, Hadoop, and Spark. Common to these systems are a fork operation, where jobs are first divided into tasks that are processed in parallel, and a later join operation, where completed tasks wait until the results of all tasks of a job can be combined and the job leaves the system. The synchronization constraint of the join operation makes the analysis of fork-join systems challenging and few explicit results are known. In this work, we model fork-join systems using a max-plus server model that enables us to derive statistical bounds on waiting and sojourn times for general arrival and service time processes. We contribute end-to-end delay bounds for multi-stage fork-join networks that grow in O(h ln k) for h fork-join stages, each with k parallel servers. We…
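The fork and join operations described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code. It assumes a single fork-join stage with k first-come-first-served servers, Poisson arrivals, and independent exponential task service times; the function names and parameter values are hypothetical. Each server follows the standard max-plus recursion (a task departs at the later of its arrival and the previous departure, plus its service time), and a job leaves once its slowest task has finished.

```python
import random


def fork_join_sojourn_times(arrivals, service):
    """Sojourn time of each job in a single-stage fork-join system.

    arrivals: non-decreasing job arrival times.
    service:  service[n][i] is the service time of job n's task at server i.
    """
    k = len(service[0])
    prev_dep = [0.0] * k  # departure time of the previous task at each server
    sojourn = []
    for a, s in zip(arrivals, service):
        # Max-plus recursion per server: a task starts once it has arrived
        # and the server has finished the previous task.
        prev_dep = [max(a, d) + s_i for d, s_i in zip(prev_dep, s)]
        # Join: the job leaves when its slowest task is done.
        sojourn.append(max(prev_dep) - a)
    return sojourn


def simulate(k, n_jobs=20000, lam=0.5, mu=1.0, seed=1):
    """Mean sojourn time for k servers (assumed Poisson/exponential model)."""
    rng = random.Random(seed)
    t = 0.0
    arrivals = []
    for _ in range(n_jobs):
        t += rng.expovariate(lam)  # exponential inter-arrival times
        arrivals.append(t)
    service = [[rng.expovariate(mu) for _ in range(k)] for _ in range(n_jobs)]
    times = fork_join_sojourn_times(arrivals, service)
    return sum(times) / len(times)


if __name__ == "__main__":
    for k in (1, 2, 4, 8, 16, 32):
        print(k, round(simulate(k), 2))
```

Running the script prints the empirical mean sojourn time for increasing k, which can be compared informally against the logarithmic growth in k stated above. A simulated mean is only an illustration and does not substitute for the paper's statistical bounds.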
Taxonomy
Topics: Advanced Queuing Theory Analysis · Distributed systems and fault tolerance · Interconnection Networks and Systems
