Achievable Stability in Redundancy Systems
Youri Raaijmakers, Sem Borst

TL;DR
This paper analyzes the stability limits of parallel server systems with job replication, showing how knowledge of job types influences the optimal replication strategy for maximizing stability.
Contribution
It provides a comprehensive stability analysis for heterogeneous, replicated server systems considering job type observability and workload distribution.
Findings
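When job types are known and speed variations are NBU (New-Better-than-Used), no replication yields a strictly larger stability region than replicating each job to multiple servers.
When job types are unknown and speed variations are NWU (New-Worse-than-Used), full replication yields a larger stability region than no replication.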
Stability regions depend critically on job type observability and workload distribution assumptions.
Abstract
We consider a system with N parallel servers where incoming jobs are immediately replicated to, say, d servers. Each of the N servers has its own queue and follows a FCFS discipline. As soon as the first job replica is completed, the remaining replicas are abandoned. We investigate the achievable stability region for a quite general workload model with different job types and heterogeneous servers, reflecting job-server affinity relations which may arise from data locality issues and soft compatibility constraints. Under the assumption that job types are known beforehand we show for New-Better-than-Used (NBU) distributed speed variations that no replication (d = 1) gives a strictly larger stability region than replication (d > 1). Strikingly, this does not depend on the underlying distribution of the intrinsic job sizes, but observing the job types is essential for this statement…
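The NBU and NWU classes named in the abstract have a standard definition in reliability theory: a nonnegative random variable with survival function S(t) = P(X > t) is NBU if S(s + t) ≤ S(s)·S(t) for all s, t ≥ 0, and NWU if the inequality is reversed. As a minimal sketch, not taken from the paper, the Python snippet below tests that inequality on a finite grid. The Weibull family is used purely as an illustrative assumption, and the helper names `weibull_survival` and `classify` are hypothetical.

```python
import math


def weibull_survival(shape):
    """Return S(t) = exp(-t**shape), the survival function of a unit-scale Weibull."""
    return lambda t: math.exp(-(t ** shape))


def classify(survival, grid, tol=1e-12):
    """Label a survival function by testing the NBU/NWU inequality on a grid.

    This is a numerical check on finitely many points, not a proof.
    """
    pairs = [(s, t) for s in grid for t in grid]
    # NBU: S(s + t) <= S(s) * S(t) for every pair on the grid
    nbu = all(survival(s + t) <= survival(s) * survival(t) + tol for s, t in pairs)
    # NWU: the reversed inequality
    nwu = all(survival(s + t) >= survival(s) * survival(t) - tol for s, t in pairs)
    if nbu and nwu:
        return "NBU and NWU"
    if nbu:
        return "NBU"
    if nwu:
        return "NWU"
    return "neither"


# Grid of time points 0.1, 0.2, ..., 5.0 (an arbitrary choice for illustration)
grid = [0.1 * k for k in range(1, 51)]

for shape in (0.5, 1.0, 2.0):
    print(shape, classify(weibull_survival(shape), grid))
```

Under these assumptions, a Weibull shape above 1 falls in the NBU class and a shape below 1 in the NWU class, while the exponential case (shape 1) satisfies both inequalities with equality.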
Taxonomy
Topics: Advanced Queuing Theory Analysis · Distributed systems and fault tolerance · Distributed and Parallel Computing Systems
