Learning to Schedule in Parallel-Server Queues with Stochastic Bilinear Rewards
Jung-hun Kim, Milan Vojnovic

TL;DR
This paper introduces a scheduling algorithm for multi-class queueing systems with uncertain bilinear rewards, balancing reward maximization and queue stability, and demonstrates its effectiveness through theoretical guarantees and experiments.
Contribution
The paper proposes a bandit-based scheduling algorithm for queueing systems with bilinear reward models that guarantees sub-linear regret and queue stability.
Findings
Achieves sub-linear regret of $\tilde{O}(\sqrt{T})$
Guarantees queue stability with bounded mean holding costs
Demonstrates efficiency through numerical experiments
Abstract
We consider the problem of scheduling in multi-class, parallel-server queuing systems with uncertain rewards from job-server assignments. In this scenario, jobs incur holding costs while awaiting completion, and job-server assignments yield observable stochastic rewards with unknown mean values. The mean rewards for job-server assignments are assumed to follow a bilinear model with respect to features that characterize jobs and servers. Our objective is to minimize regret by maximizing the cumulative reward of job-server assignments over a time horizon, while keeping the total job holding cost bounded to ensure the stability of the queueing system. This problem is motivated by applications requiring resource allocation in network systems. In this problem, it is essential to control the tradeoff between reward maximization and fair allocation for the stability of the underlying queuing…
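The bilinear reward model mentioned in the abstract can be illustrated with a minimal sketch. It assumes the mean reward of assigning a job with feature vector x to a server with feature vector z is x^T Θ z for an unknown matrix Θ. The names `theta`, `job_feat`, `server_feat`, and `ridge_estimate` are illustrative and not taken from the paper, and the ridge-regression estimator is a standard stand-in for learning Θ from observed rewards, not the authors' algorithm.

```python
import numpy as np


def bilinear_mean_reward(theta, job_feat, server_feat):
    """Mean reward x^T Theta z for job features x and server features z."""
    return float(job_feat @ theta @ server_feat)


def ridge_estimate(job_feats, server_feats, rewards, reg=1.0):
    """Ridge-regression estimate of Theta from (job, server, reward) samples.

    Relies on the identity x^T Theta z = <vec(x z^T), vec(Theta)>, which
    makes the bilinear model linear in the flattened parameter matrix.
    """
    d1, d2 = job_feats.shape[1], server_feats.shape[1]
    # Row n holds the flattened outer product of the n-th feature pair.
    design = np.einsum("ni,nj->nij", job_feats, server_feats)
    design = design.reshape(len(rewards), d1 * d2)
    gram = design.T @ design + reg * np.eye(d1 * d2)
    theta_vec = np.linalg.solve(gram, design.T @ rewards)
    # Row-major reshape matches the row-major flattening used above.
    return theta_vec.reshape(d1, d2)
```

Because the model is linear in vec(Θ), tools built for linear bandits (confidence sets, optimistic estimates) carry over in principle; how the paper combines such estimates with its scheduling rule is not covered by this sketch.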
Taxonomy
Topics: Advanced Bandit Algorithms Research · Age of Information Optimization · Advanced Queuing Theory Analysis
