A Model of Job Parallelism for Latency Reduction in Large-Scale Systems
Ayalvadi Ganesh, Arpan Mukhopadhyay

TL;DR
This paper models job parallelism in large-scale systems and shows that letting each job be served by several servers at once sharply reduces average delay: as the system load approaches capacity, the reduction relative to single-server processing becomes exponential.
Contribution
It introduces an idealized model for job parallelism, analyzes it with a mean-field approximation to quantify the delay reduction due to parallel processing, and makes progress towards rigorously justifying that approximation.
Findings
Average server occupancy grows only logarithmically, as O(log(1/(1−λ))), as the load λ approaches 1 for d≥2
Parallelism (d≥2) reduces job response time exponentially relative to the single-server case (d=1)
Mean-field analysis aligns with simulation results
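The "exponential" reduction can be made concrete with a short worked comparison. This is a sketch, not the paper's derivation: it assumes the logarithmic scaling means O(log(1/(1−λ))), writes E[N_d] for the mean occupancy of a server and T_d for the mean response time (both symbols are introduced here for illustration), and uses only two standard facts, the M/M/1 processor-sharing mean occupancy and Little's law.

```latex
% d = 1: each server is an independent M/M/1 processor-sharing queue with load \lambda.
\mathbb{E}[N_1] = \frac{\lambda}{1-\lambda},
\qquad
T_1 = \frac{\mathbb{E}[N_1]}{\lambda} = \frac{1}{1-\lambda}.

% d \ge 2: each job occupies d servers, so the system holds n\,\mathbb{E}[N_d]/d jobs.
% Little's law with total arrival rate n\lambda gives
T_d = \frac{n\,\mathbb{E}[N_d]/d}{n\lambda} = \frac{\mathbb{E}[N_d]}{d\lambda}
    = O\!\left(\log\frac{1}{1-\lambda}\right)
    = O\!\left(\log T_1\right)
\quad \text{as } \lambda \to 1 .
```

Under these assumptions the response time with parallelism is of the order of the logarithm of the response time without it, which is the sense in which the reduction is exponential.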
Abstract
Processing computation-intensive jobs at multiple processing cores in parallel is essential in many real-world applications. In this paper, we consider an idealised model for job parallelism in which a job can be served simultaneously by d distinct servers. The job is considered complete when the total amount of work done on it by the d servers equals its size. We study the effect of parallelism on the average delay of jobs. Specifically, we analyze a system consisting of n parallel processor sharing servers in which jobs arrive according to a Poisson process of rate nλ (λ < 1) and each job brings an exponentially distributed amount of work with unit mean. Upon arrival, a job selects d servers uniformly at random and joins all the chosen servers simultaneously. We show by a mean-field analysis that, for fixed d ≥ 2 and large n, the average occupancy of…
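The model in the abstract can be explored with a small event-driven simulation. The following is a minimal sketch, not the authors' code: the function name, its parameters, and the symbols n (number of servers), d (servers joined per job) and lam (per-server arrival rate, so the total rate is n·lam) are assumptions chosen for illustration. It relies on the memoryless property of exponential job sizes: a job at servers with occupancies N_s is served at total rate Σ 1/N_s, so each busy server generates departures at rate 1 and the departing job is uniform among the jobs at that server.

```python
import random


def simulate_mean_occupancy(n, d, lam, num_events, seed=0):
    """Estimate the time-averaged number of jobs per server.

    Hypothetical helper for illustration. A job that joined d servers
    counts once at each of them. The system starts empty.
    """
    rng = random.Random(seed)
    jobs_at = [[] for _ in range(n)]  # job ids currently at each server
    servers_of = {}                   # job id -> list of servers it joined
    busy = 0                          # number of non-empty servers
    total_copies = 0                  # sum of all server occupancies
    area = 0.0                        # time integral of total_copies
    clock = 0.0
    next_id = 0
    arrival_rate = n * lam

    for _ in range(num_events):
        # Each busy server completes some job at rate 1 (processor sharing,
        # unit-mean exponential work), so the total event rate is:
        total_rate = arrival_rate + busy
        dt = rng.expovariate(total_rate)
        area += total_copies * dt
        clock += dt

        if busy == 0 or rng.random() * total_rate < arrival_rate:
            # Arrival: join d distinct servers chosen uniformly at random.
            chosen = rng.sample(range(n), d)
            for s in chosen:
                if not jobs_at[s]:
                    busy += 1
                jobs_at[s].append(next_id)
            servers_of[next_id] = chosen
            total_copies += d
            next_id += 1
        else:
            # Departure: pick a busy server uniformly (rejection sampling),
            # then a uniformly random job at that server.
            while True:
                s = rng.randrange(n)
                if jobs_at[s]:
                    break
            job = rng.choice(jobs_at[s])
            # The finished job leaves every server it had joined.
            for t in servers_of.pop(job):
                jobs_at[t].remove(job)
                if not jobs_at[t]:
                    busy -= 1
            total_copies -= d

    return area / (n * clock)
```

For d = 1 the servers are independent M/M/1 processor-sharing queues, so a call such as `simulate_mean_occupancy(n=50, d=1, lam=0.5, num_events=200_000)` should come out near λ/(1−λ) = 1, which gives a quick sanity check. Running the same function with d = 2 at loads close to 1 lets one compare occupancies with and without parallelism. The estimate includes the initial transient from the empty state, so longer runs are needed as the load increases.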
Taxonomy
Topics: Advanced Queuing Theory Analysis · Cloud Computing and Resource Management · Distributed and Parallel Computing Systems
