Differentiable Discrete Event Simulation for Queuing Network Control

Ethan Che; Jing Dong; Hongseok Namkoong

arXiv:2409.03740·cs.LG·September 6, 2024

TL;DR

This paper introduces a scalable, differentiable simulation framework for queueing network control that significantly improves policy training efficiency and stability using pathwise gradients and neural network architectures.

Contribution

It presents a novel smoothing technique for discrete event dynamics enabling pathwise gradient computation in large-scale queueing networks.
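As a rough illustration of what smoothing a discrete decision can look like, the sketch below replaces a hard one-hot choice (for example, which queue a server picks) with a softmax relaxation. This summary does not say which smoothing technique the paper uses, so the softmax, the `temperature` parameter, and the function names `hard_choice` and `soft_choice` are all assumptions made for illustration only.

```python
import math

def hard_choice(scores):
    """One-hot vector selecting the highest score.

    Piecewise constant in the scores, so its gradient is zero almost everywhere.
    """
    best = max(range(len(scores)), key=lambda i: scores[i])
    return [1.0 if i == best else 0.0 for i in range(len(scores))]

def soft_choice(scores, temperature=1.0):
    """Softmax relaxation (an assumed stand-in, not the paper's stated method).

    Smooth in the scores; approaches the one-hot choice as temperature -> 0.
    """
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp((s - m) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

scores = [1.0, 2.0, 0.5]          # hypothetical priority scores for three queues
hard = hard_choice(scores)
soft_sharp = soft_choice(scores, temperature=0.05)
soft_smooth = soft_choice(scores, temperature=1.0)
```

A lower temperature tracks the discrete choice more closely but produces steeper, less informative gradients; a higher temperature gives smoother gradients at the cost of more bias relative to the discrete system.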

Findings

01

Policy gradient estimators are several orders of magnitude more accurate than REINFORCE.

02

Training with pathwise gradients yields 50-1000x better sample efficiency.

03

The proposed methods handle non-stationary and non-exponential systems effectively.

Abstract

Queuing network control is essential for managing congestion in job-processing systems such as service systems, communication networks, and manufacturing processes. Despite growing interest in applying reinforcement learning (RL) techniques, queueing network control poses distinct challenges, including high stochasticity, large state and action spaces, and lack of stability. To tackle these challenges, we propose a scalable framework for policy optimization based on differentiable discrete event simulation. Our main insight is that by implementing a well-designed smoothing technique for discrete event dynamics, we can compute pathwise policy gradients for large-scale queueing networks using auto-differentiation software (e.g., TensorFlow, PyTorch) and GPU parallelization. Through extensive empirical experiments, we observe that our policy gradient estimators are several orders of…
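To make the idea of a pathwise gradient concrete, here is a minimal sketch on a much simpler problem than the one the paper addresses: a single-server queue whose waiting times follow the Lindley recursion, differentiated with respect to a continuous service rate `mu`. The derivative is propagated by hand along the sample path rather than by auto-differentiation software, and the function name, parameters, and the choice of exponential service and interarrival times are assumptions for illustration, not the authors' implementation.

```python
import random

def mean_wait_and_grad(mu, lam=0.5, n_jobs=2000, seed=0):
    """Simulate a single-server queue; return (mean wait, d(mean wait)/d(mu)).

    Hypothetical helper: the derivative is carried along the same sample path
    that produces the waiting times (a pathwise gradient).
    """
    rng = random.Random(seed)             # fixed seed: same randomness for every mu
    wait, dwait = 0.0, 0.0                # current wait and its derivative w.r.t. mu
    total, dtotal = 0.0, 0.0
    for _ in range(n_jobs):
        base_service = rng.expovariate(1.0)    # unit-rate draw, rescaled by mu below
        interarrival = rng.expovariate(lam)
        service = base_service / mu
        dservice = -base_service / mu ** 2     # derivative of service time w.r.t. mu
        nxt = wait + service - interarrival    # Lindley recursion before clipping at 0
        if nxt > 0.0:
            wait, dwait = nxt, dwait + dservice
        else:
            wait, dwait = 0.0, 0.0             # queue empties; sensitivity resets
        total += wait
        dtotal += dwait
    return total / n_jobs, dtotal / n_jobs

mu = 1.0
value, grad = mean_wait_and_grad(mu)

# Sanity check: central finite difference on the same random numbers.
eps = 1e-6
fd = (mean_wait_and_grad(mu + eps)[0] - mean_wait_and_grad(mu - eps)[0]) / (2 * eps)
```

Because the randomness is drawn independently of `mu` and only rescaled, a single simulated path yields both the performance estimate and its gradient. Score-function estimators such as REINFORCE do not differentiate through the dynamics in this way, which is the contrast the abstract draws.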

Taxonomy

Topics: Simulation Techniques and Applications · Software System Performance and Reliability · Cloud Computing and Resource Management
