Average Reward Reinforcement Learning for Wireless Radio Resource Management
Kun Yang, Jing Yang, Cong Shen

TL;DR
This paper identifies a mismatch between the discounted reward RL formulation and the undiscounted objective of wireless radio resource management (RRM). It proposes an average reward RL approach and introduces ARO SAC, which improves system performance by 15% over traditional RL methods.
Contribution
The paper systematically investigates the discrepancy between discounted and undiscounted RL in RRM and introduces ARO SAC, a novel average reward RL algorithm tailored for wireless networks.
Findings
15% performance gain over traditional RL methods
Demonstrates the effectiveness of average reward RL in wireless RRM
Quantifies the gap between discounted and average reward formulations
Abstract
In this paper, we address a crucial but often overlooked issue in applying reinforcement learning (RL) to radio resource management (RRM) in wireless communications: the mismatch between the discounted reward RL formulation and the undiscounted goal of wireless network optimization. To the best of our knowledge, we are the first to systematically investigate this discrepancy, starting with a discussion of the problem formulation followed by simulations that quantify the extent of the gap. To bridge this gap, we introduce the use of average reward RL, a method that aligns more closely with the long-term objectives of RRM. We propose a new method called the Average Reward Off policy Soft Actor Critic (ARO SAC), an adaptation of the well-known Soft Actor Critic algorithm to the average reward framework. This new method achieves significant performance improvement; our simulation results…
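The mismatch the abstract describes can be illustrated with a toy comparison of the two objectives. The sketch below is not taken from the paper: the reward streams, the discount factor, and all function names are hypothetical, chosen only to show that a discounted return and a long-run average reward can rank the same two policies in opposite order.

```python
# Toy illustration (hypothetical numbers, not the paper's simulation):
# a discounted objective and an average-reward objective can disagree
# about which of two reward streams is better.

def discounted_return(rewards, gamma):
    """Sum of gamma**t * r_t over the stream."""
    total, weight = 0.0, 1.0
    for r in rewards:
        total += weight * r
        weight *= gamma
    return total

def average_reward(rewards):
    """Empirical long-run average reward over the stream."""
    return sum(rewards) / len(rewards)

def reward_stream(early, late, switch_step, horizon):
    """Reward `early` before `switch_step`, then `late` until `horizon`."""
    return [early if t < switch_step else late for t in range(horizon)]

HORIZON = 10_000   # assumed long horizon standing in for continuing operation
GAMMA = 0.9        # assumed discount factor

# Policy A: high reward at first, low reward in steady state.
stream_a = reward_stream(early=2.0, late=0.5, switch_step=10, horizon=HORIZON)
# Policy B: nothing at first, higher reward in steady state.
stream_b = reward_stream(early=0.0, late=1.0, switch_step=10, horizon=HORIZON)

disc_a = discounted_return(stream_a, GAMMA)
disc_b = discounted_return(stream_b, GAMMA)
avg_a = average_reward(stream_a)
avg_b = average_reward(stream_b)

print(f"discounted: A={disc_a:.2f}, B={disc_b:.2f}")
print(f"average:    A={avg_a:.4f}, B={avg_b:.4f}")
```

With these assumed numbers, the discounted return favors policy A because it weights the first few steps heavily, while the average reward favors policy B, whose steady-state reward is higher. A long-running network objective corresponds to the second criterion.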
Taxonomy
Methods: Adam · Experience Replay · Dense Connections · Soft Actor Critic
