Offline Critic-Guided Diffusion Policy for Multi-User Delay-Constrained Scheduling

Zhuoran Li; Ruishuo Chen; Hai Zhong; Longbo Huang

arXiv:2501.12942 · cs.AI · January 23, 2025

TL;DR

This paper introduces SOCD, an offline reinforcement learning algorithm with diffusion-based policy networks and critic guidance, enabling effective multi-user delay-constrained scheduling without online system interaction.

Contribution

The paper presents a novel offline RL approach with diffusion policies and critic guidance for delay-constrained scheduling, avoiding online data collection and handling complex system dynamics.

Findings

01. SOCD outperforms existing methods in various environments.

02. It is robust to system dynamics and partial observability.

03. It eliminates the need for online system interaction during training.

Abstract

Effective multi-user delay-constrained scheduling is crucial in various real-world applications, such as instant messaging, live streaming, and data center management. In these scenarios, schedulers must make real-time decisions to satisfy both delay and resource constraints without prior knowledge of system dynamics, which are often time-varying and challenging to estimate. Current learning-based methods typically require interactions with actual systems during the training stage, which can be difficult or impractical, as such interactions can significantly degrade system performance and incur substantial service costs. To address these challenges, we propose a novel offline reinforcement learning-based algorithm, named Scheduling by Offline Learning with Critic Guidance and Diffusion Generation (SOCD), to learn efficient scheduling…

Taxonomy

Topics: Advanced Wireless Network Optimization · Age of Information Optimization · Transportation and Mobility Innovations
