From Simulation to Practice: Generalizable Deep Reinforcement Learning for Cellular Schedulers

Petteri Kela; Bryan Liu; Alvaro Valcarce

arXiv:2411.08529 · eess.SP · October 10, 2025

Open Access

TL;DR

This paper enhances deep reinforcement learning algorithms for cellular packet scheduling, yielding 3GPP-compliant schedulers capable of real-time operation that generalize across bandwidths and traffic models and outperform heuristic methods in 5G simulations.

Contribution

It introduces a novel combination of training techniques for Proximal Policy Optimization (PPO) and a new Distributional Soft Actor-Critic Discrete (DSACD) algorithm, and demonstrates their effectiveness in creating practical, generalizable deep schedulers for 5G networks.

Findings

01. Deep RL schedulers outperform heuristics in 5G simulations.

02. The proposed algorithms generalize across bandwidths and traffic models.

03. Actor network complexity is kept minimal, which suits real-time deployment.

Abstract

Efficient radio packet scheduling remains one of the most challenging tasks in cellular networks, and while heuristic methods exist, practical deep learning-based schedulers that are 3GPP-compliant and capable of real-time operation in 5G and beyond are still missing. To address this, we first take a critical look at previous deep scheduler efforts. Secondly, we enhance State-of-the-Art (SoTA) deep Reinforcement Learning (RL) algorithms and adapt them to train our deep scheduler. In particular, we propose a novel combination of training techniques for Proximal Policy Optimization (PPO) and a new Distributional Soft Actor-Critic Discrete (DSACD) algorithm, which outperformed other variants tested. These improvements were achieved while maintaining minimal actor network complexity, making them suitable for real-time computing environments. Furthermore, entropy learning in SACD was…

Taxonomy

Topics: Advanced Wireless Network Optimization · Wireless Communication Networks Research · Advanced MIMO Systems Optimization