Average Reward Reinforcement Learning for Wireless Radio Resource Management

Kun Yang; Jing Yang; Cong Shen

arXiv:2501.06700·cs.IT·January 14, 2025

TL;DR

This paper identifies a mismatch between the discounted reward RL formulation commonly applied to wireless radio resource management (RRM) and the undiscounted goal of wireless network optimization. It proposes an average reward RL approach and introduces ARO SAC, which improves system performance by 15% over traditional RL methods.

Contribution

The paper systematically investigates the discrepancy between discounted and undiscounted RL in RRM and introduces ARO SAC, a novel average reward RL algorithm tailored for wireless networks.

Findings

01  15% performance gain over traditional RL methods

02  Demonstrates the effectiveness of average reward RL in wireless RRM

03  Quantifies the gap between discounted and average reward formulations

Abstract

In this paper, we address a crucial but often overlooked issue in applying reinforcement learning (RL) to radio resource management (RRM) in wireless communications: the mismatch between the discounted reward RL formulation and the undiscounted goal of wireless network optimization. To the best of our knowledge, we are the first to systematically investigate this discrepancy, starting with a discussion of the problem formulation, followed by simulations that quantify the extent of the gap. To bridge this gap, we introduce the use of average reward RL, a method that aligns more closely with the long-term objectives of RRM. We propose a new method called the Average Reward Off policy Soft Actor Critic (ARO SAC), an adaptation of the well-known Soft Actor Critic algorithm to the average reward framework. This new method achieves significant performance improvement; our simulation results…
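To make the mismatch described in the abstract concrete, here is a minimal Python sketch. It is not taken from the paper: the two reward sequences, the horizon, and the discount factor 0.9 are hypothetical values chosen only for illustration. It shows that a discounted objective and an average reward objective can rank the same two policies in opposite order.

```python
def discounted_return(rewards, gamma):
    """Sum of gamma**t * r_t over a finite reward sequence."""
    total = 0.0
    for t, r in enumerate(rewards):
        total += (gamma ** t) * r
    return total


def average_reward(rewards):
    """Plain (undiscounted) mean reward per step."""
    return sum(rewards) / len(rewards)


# Hypothetical per-step rewards (e.g. a throughput proxy) for two policies.
horizon = 1000
front_loaded = [10.0] * 10 + [1.0] * (horizon - 10)  # high early, low afterwards
steady = [2.0] * horizon                             # moderate at every step

gamma = 0.9  # assumed discount factor, for illustration only

disc_front = discounted_return(front_loaded, gamma)
disc_steady = discounted_return(steady, gamma)
avg_front = average_reward(front_loaded)
avg_steady = average_reward(steady)

print("discounted:", disc_front, disc_steady)
print("average:   ", avg_front, avg_steady)
```

Under these assumed numbers the discounted return favors the front-loaded policy, while the long-run average favors the steady one. This is the kind of disagreement the abstract refers to when it contrasts the discounted formulation with the undiscounted goal of RRM.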

Taxonomy

Methods: Adam · Experience Replay · Dense Connections · Soft Actor Critic