Learning Optimal Admission Control in Partially Observable Queueing Networks

Jonatha Anselmi; Bruno Gaujal; Louis-Sébastien Rebuffi

arXiv:2308.02391·cs.LG·August 7, 2023

TL;DR

This paper introduces a reinforcement learning algorithm that efficiently learns the optimal admission control policy in a partially observable queueing network. Its regret grows only sub-linearly in the maximal number of jobs in the network, $S$, and does not depend on the diameter of the underlying MDP.

Contribution

The paper develops an RL algorithm that exploits the structure of queueing networks through Norton's equivalent theorem for closed product-form networks, achieving a regret bound that does not depend on the MDP diameter.

Findings

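01. Regret depends only sub-linearly on the maximal number of jobs in the network, $S$.

02. The regret bound does not depend on the diameter of the underlying MDP, in contrast with existing regret analyses; in most queueing systems this diameter is at least exponential in $S$.

03. The algorithm copes with partial observability: only the arrival and departure times from the network are observed.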

Abstract

We present an efficient reinforcement learning algorithm that learns the optimal admission control policy in a partially observable queueing network. Specifically, only the arrival and departure times from the network are observable, and optimality refers to the average holding/rejection cost in infinite horizon. While reinforcement learning in Partially Observable Markov Decision Processes (POMDP) is prohibitively expensive in general, we show that our algorithm has a regret that only depends sub-linearly on the maximal number of jobs in the network, $S$. In particular, in contrast with existing regret analyses, our regret bound does not depend on the diameter of the underlying Markov Decision Process (MDP), which in most queueing systems is at least exponential in $S$. The novelty of our approach is to leverage Norton's equivalent theorem for closed product-form queueing networks…
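To make the objective in the abstract concrete, here is a minimal sketch of the average holding/rejection cost when every parameter is known. It is not the paper's learning algorithm. It assumes the network has already been reduced to a single queue with state-dependent service rates (the kind of reduction Norton's theorem suggests), that arrivals are Poisson, and that admission follows a simple threshold rule. The function names, the rates, and the cost values are all hypothetical choices made for illustration.

```python
from typing import Callable, List


def stationary_distribution(
    lam: float, mu: Callable[[int], float], threshold: int
) -> List[float]:
    """Stationary law of a birth-death queue that admits jobs only while n < threshold.

    lam is the arrival rate and mu(n) the total service rate with n jobs present
    (both assumed known here, which is exactly what the learner does not have).
    """
    weights = [1.0]
    for n in range(1, threshold + 1):
        # Detailed balance: pi(n) * mu(n) = pi(n - 1) * lam
        weights.append(weights[-1] * lam / mu(n))
    total = sum(weights)
    return [w / total for w in weights]


def average_cost(
    lam: float,
    mu: Callable[[int], float],
    threshold: int,
    holding_cost: float,
    rejection_cost: float,
) -> float:
    """Long-run average cost per unit time of a threshold admission policy."""
    pi = stationary_distribution(lam, mu, threshold)
    # Holding cost accrues per job present per unit time.
    holding = holding_cost * sum(n * p for n, p in enumerate(pi))
    # Arrivals are rejected only when the queue sits at the threshold.
    rejection = rejection_cost * lam * pi[threshold]
    return holding + rejection


def best_threshold(
    lam: float,
    mu: Callable[[int], float],
    max_jobs: int,
    holding_cost: float,
    rejection_cost: float,
) -> int:
    """Brute-force search over thresholds 0..max_jobs (max_jobs plays the role of S)."""
    return min(
        range(max_jobs + 1),
        key=lambda k: average_cost(lam, mu, k, holding_cost, rejection_cost),
    )


if __name__ == "__main__":
    # Hypothetical example: two servers of rate 1, arrival rate 1.5, S = 20.
    rate = lambda n: float(min(n, 2))
    k = best_threshold(1.5, rate, 20, holding_cost=1.0, rejection_cost=5.0)
    print("threshold:", k, "cost:", average_cost(1.5, rate, k, 1.0, 5.0))
```

In the setting of the paper the rates are unknown and only arrival and departure times are observed, so a quantity like this cannot be computed directly; the sketch only shows what is being optimized.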

Taxonomy

Topics: Age of Information Optimization · Advanced Queuing Theory Analysis · Transportation and Mobility Innovations