
TL;DR
This paper shows that, in distributed systems, splitting a token bucket into multiple sub-buckets with the same aggregate parameters increases overall job latency, regardless of the job arrival process or how jobs are distributed across the sub-buckets.
Contribution
It proves that dividing a token bucket into multiple sub-buckets always results in a higher total job latency, and therefore a higher average job latency, independent of the arrival pattern and of how jobs are distributed.
Findings
Splitting token buckets increases total job latency.
The increase holds regardless of the job arrival process and of how jobs are distributed across the sub-buckets.
Average job latency is higher when token buckets are divided.
Abstract
This note is concerned with the impact on job latency of splitting a token bucket into multiple sub-token buckets with equal aggregate parameters and offered the same job arrival process. The situation commonly arises in distributed computing environments where job arrivals are rate controlled (each job needs one token to enter the system), but capacity limitations call for distributing jobs across multiple compute resources with scalability considerations preventing the use of a centralized rate control component (each compute resource is responsible for monitoring and enforcing that the job stream it receives conforms to a certain traffic envelope). The question we address is to what extent splitting a token bucket into multiple sub-token buckets that individually rate control a subset of the original arrival process affects job latency, when jobs wait for a token whenever the token…
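The setup described in the abstract can be illustrated with a minimal simulation sketch. It is not taken from the paper: the function names `latencies` and `split_total`, the equal split of rate and bucket size across sub-buckets, the round-robin assignment, and the numeric parameters are all assumptions chosen for illustration. Tokens are assumed to accrue continuously at a fixed rate, each bucket starts full, and jobs at a bucket are served first-in, first-out.

```python
from typing import Callable, List, Sequence


def latencies(arrivals: Sequence[float], rate: float, size: float) -> List[float]:
    """Waiting time of each job at one FIFO token bucket that starts full.

    Assumes arrival times are >= 0 and that each job needs exactly one token.
    """
    if size < 1:
        raise ValueError("bucket must be able to hold at least one token")
    tokens, clock, out = size, 0.0, []
    for t in sorted(arrivals):
        start = max(t, clock)  # FIFO: a job cannot overtake the previous one
        tokens = min(size, tokens + rate * (start - clock))  # refill up to the cap
        wait = 0.0 if tokens >= 1 else (1 - tokens) / rate   # time until one token exists
        tokens = tokens - 1 if tokens >= 1 else 0.0          # the job consumes its token
        clock = start + wait                                  # departure time of this job
        out.append(clock - t)
    return out


def split_total(arrivals: Sequence[float], rate: float, size: float,
                k: int, assign: Callable[[int], int]) -> float:
    """Total latency when the bucket is split into k equal sub-buckets.

    `assign` maps a job index to a sub-bucket index (hypothetical dispatch rule).
    """
    groups: List[List[float]] = [[] for _ in range(k)]
    for i, t in enumerate(arrivals):
        groups[assign(i)].append(t)
    return sum(sum(latencies(g, rate / k, size / k)) for g in groups)


# A burst of six jobs at time 0; one bucket with rate 1 token/s and size 4,
# versus two sub-buckets with rate 0.5 token/s and size 2 each.
arrivals = [0.0] * 6
single = sum(latencies(arrivals, 1.0, 4.0))
split = split_total(arrivals, 1.0, 4.0, 2, lambda i: i % 2)
print(single, split)  # prints "3.0 4.0"
```

In this toy case the single bucket admits four jobs immediately and delays the remaining two by 1 s and 2 s, while each sub-bucket admits two jobs immediately and delays its third by 2 s, so the total latency rises from 3 s to 4 s even under a perfectly balanced round-robin split.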
Taxonomy
Topics: Network Traffic and Congestion Control · Cloud Computing and Resource Management · Distributed and Parallel Computing Systems
