TL;DR
This paper proposes load balancing policies for large-scale systems that use attained service time information alongside queue length to improve job response times, especially when job sizes are highly variable.
Contribution
It introduces and analyzes novel load balancing policies that incorporate attained service time, demonstrating their effectiveness through theoretical analysis and simulations.
Findings
Significant reduction in waiting times using attained service time information.
Policies using only attained service time outperform traditional queue length-based methods.
Improved response times in moderately loaded systems without queue length data.
Abstract
Our interest lies in load balancing jobs in large-scale systems consisting of multiple dispatchers and FCFS servers. In the absence of any information on job sizes, dispatchers typically use queue length information reported by the servers to assign incoming jobs. When job sizes are highly variable, using only queue length information is clearly suboptimal, and performance can be improved if some indication can be provided to the dispatcher about the size of an ongoing job. In an FCFS server, measuring the attained service time of the ongoing job is easy, and servers can therefore report this attained service time together with the queue length when queried by a dispatcher. In this paper we propose and analyse a variety of load balancing policies that exploit both the queue length and attained service time to assign jobs, as well as policies for which only the attained service time of the…
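To make the query-and-assign mechanism described in the abstract concrete, here is a minimal Python sketch. It is not one of the paper's policies: it assumes a dispatcher that samples d servers at random, prefers the shortest queue, and uses the reported attained service time only to break ties. The names ServerReport, pick_server, dispatch, and the prefer_small_attained flag are hypothetical, introduced for illustration. Which tie-break direction helps is assumed to depend on the job size distribution, so it is left as a parameter.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerReport:
    """What a server reports when queried (hypothetical structure)."""
    server_id: int
    queue_length: int        # jobs at the server, including the one in service
    attained_service: float  # service received so far by the job in service (0.0 if idle)


def pick_server(reports, prefer_small_attained=True):
    """Choose the shortest queue; break ties on attained service time.

    Assumption: the tie-break direction is a tunable choice, not a rule
    taken from the paper.
    """
    def sort_key(report):
        tie_break = (report.attained_service if prefer_small_attained
                     else -report.attained_service)
        # server_id is the last key so the choice is deterministic
        return (report.queue_length, tie_break, report.server_id)

    return min(reports, key=sort_key).server_id


def dispatch(all_reports, d, rng, prefer_small_attained=True):
    """Sample d distinct servers uniformly at random and pick one of them."""
    sampled = rng.sample(all_reports, d)
    return pick_server(sampled, prefer_small_attained)
```

For example, given reports with (queue length, attained service) of (2, 5.0), (1, 3.0), and (1, 0.5) for servers 0, 1, and 2, pick_server returns server 2 by default and server 1 when prefer_small_attained is False.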
