Learning Unknown Service Rates in Queues: A Multi-Armed Bandit Approach

Subhashini Krishnasamy; Rajat Sen; Ramesh Johari; Sanjay Shakkottai

arXiv:1604.06377·cs.SY·November 25, 2019

TL;DR

This paper studies how to learn unknown service rates in a queueing system using multi-armed bandit algorithms. It shows that queue-regret scales logarithmically with time at first and as inverse time in the late stage, and it proposes an algorithm that achieves asymptotically optimal queue-regret.

Contribution

The paper analyzes queue-regret in a multi-armed bandit queueing model, shows a transition from logarithmic to inverse-time regret scaling, and provides an algorithm that attains this optimal rate.

Findings

01. Queue-regret initially scales logarithmically with time.

02. A transition to an inverse-time regret scaling occurs in the late stage.

03. The proposed algorithm achieves asymptotically optimal queue-regret.

Abstract

Consider a queueing system consisting of multiple servers. Jobs arrive over time and enter a queue for service; the goal is to minimize the size of this queue. At each opportunity for service, at most one server can be chosen, and at most one job can be served. Service is successful with a probability (the service probability) that is a priori unknown for each server. An algorithm that knows the service probabilities (the "genie") can always choose the server of highest service probability. We study algorithms that learn the unknown service probabilities. Our goal is to minimize queue-regret: the (expected) difference between the queue-lengths obtained by the algorithm, and those obtained by the "genie." Since queue-regret cannot be larger than classical regret, results for the standard multi-armed bandit problem give algorithms for which queue-regret increases no more than…
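
To make the setup in the abstract concrete, the following is a minimal simulation sketch, not the paper's algorithm. It assumes a slotted-time queue with Bernoulli arrivals at a hypothetical rate `lam`, and it swaps in the standard UCB1 index rule as the learner (indexed by slot number), which is a classical bandit policy rather than the method the authors propose. It also assumes the learner attempts service, and so observes an outcome, only when its queue is non-empty. The function names (`ucb1_choice`, `simulate`, `estimate_queue_regret`) and all parameter values are illustrative. The genie and the learner share the same arrivals and the same per-slot service draw, so the genie's queue is never longer than the learner's on any sample path.

```python
import math
import random


def ucb1_choice(counts, successes, t):
    """Standard UCB1: try every server once, then pick the largest index."""
    for k, n in enumerate(counts):
        if n == 0:
            return k
    return max(
        range(len(counts)),
        key=lambda k: successes[k] / counts[k]
        + math.sqrt(2.0 * math.log(t) / counts[k]),
    )


def simulate(mu, lam, horizon, seed=0):
    """Return (genie, learner) queue-length paths under coupled randomness.

    mu  : service probabilities, hidden from the learner (assumed values)
    lam : Bernoulli arrival probability per slot (assumed arrival model)
    """
    rng = random.Random(seed)
    best = max(range(len(mu)), key=lambda k: mu[k])
    counts = [0] * len(mu)
    successes = [0] * len(mu)
    q_genie, q_learn = 0, 0
    genie_path, learn_path = [], []
    for t in range(1, horizon + 1):
        arrival = 1 if rng.random() < lam else 0
        u = rng.random()  # shared service draw couples the two systems
        # Genie: always uses the server with the highest service probability.
        if q_genie > 0 and u < mu[best]:
            q_genie -= 1
        # Learner: picks a server by UCB1, only when there is a job to serve.
        if q_learn > 0:
            k = ucb1_choice(counts, successes, t)
            served = u < mu[k]
            counts[k] += 1
            successes[k] += int(served)
            q_learn -= int(served)
        q_genie += arrival
        q_learn += arrival
        genie_path.append(q_genie)
        learn_path.append(q_learn)
    return genie_path, learn_path


def estimate_queue_regret(mu, lam, horizon, runs=50):
    """Monte Carlo estimate of E[Q_learner(t) - Q_genie(t)] for each slot t."""
    totals = [0] * horizon
    for seed in range(runs):
        genie, learner = simulate(mu, lam, horizon, seed)
        for t in range(horizon):
            totals[t] += learner[t] - genie[t]
    return [x / runs for x in totals]


if __name__ == "__main__":
    regret = estimate_queue_regret([0.3, 0.5, 0.8], lam=0.4, horizon=2000)
    for t in (10, 100, 1000, 2000):
        print(t, regret[t - 1])
```

Printing the estimate at a few time points gives a rough view of how queue-regret evolves; the numbers depend on the assumed `mu`, `lam`, and number of runs, and are not results from the paper.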
