Dynamics of Resource Allocation in O-RANs: An In-depth Exploration of On-Policy and Off-Policy Deep Reinforcement Learning for Real-Time Applications

Manal Mehdaoui; Amine Abouaomar

arXiv:2412.01839·cs.NI·December 4, 2024

TL;DR

This paper compares an on-policy and an off-policy deep reinforcement learning model, PPO and ACER, for resource allocation in O-RAN, validating their performance and drawing insights for latency-sensitive applications.

Contribution

It provides a replication study validating the effectiveness of PPO and ACER in O-RAN resource management, emphasizing their performance differences and practical implications.

Findings

01 Both models outperform greedy algorithms in O-RAN.

02 PPO balances energy use and latency effectively.

03 ACER converges faster in resource allocation tasks.

Abstract

Deep Reinforcement Learning (DRL) is a powerful tool for addressing complex challenges in mobile networks. This paper investigates the application of two DRL models, one on-policy and one off-policy, to resource allocation in Open Radio Access Networks (O-RAN). The on-policy model is Proximal Policy Optimization (PPO), and the off-policy model is Sample Efficient Actor-Critic with Experience Replay (ACER). Both are applied to the resource allocation challenges of a Quality of Service (QoS) application with strict requirements. Motivated by the original work of Nessrine Hammami and Kim Khoa Nguyen, this study is a replication that aims to validate their findings. Both PPO and ACER are evaluated within the same experimental setup to assess and compare their performance in a scenario with latency-sensitive and latency-tolerant users. The aim is to…
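
For readers unfamiliar with the two model families named in the abstract, the following is a minimal sketch of the core quantity each one is usually associated with: the clipped surrogate objective of PPO and the truncated importance weights that ACER applies to replayed (off-policy) samples. It is not taken from the paper. The function names, the clipping range `clip_eps=0.2`, and the truncation threshold `c=10.0` are illustrative assumptions, and the sketch leaves out everything specific to the O-RAN setup (states, actions, rewards, networks).

```python
def ppo_clipped_objective(ratios, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective over a batch.

    ratios[i]     = pi_new(a_i | s_i) / pi_old(a_i | s_i)
    advantages[i] = advantage estimate for sample i
    clip_eps is an assumed, commonly used default, not a value from the paper.
    """
    total = 0.0
    for r, a in zip(ratios, advantages):
        # Keep the probability ratio inside [1 - eps, 1 + eps].
        clipped_r = max(1.0 - clip_eps, min(r, 1.0 + clip_eps))
        # Take the more pessimistic of the clipped and unclipped terms.
        total += min(r * a, clipped_r * a)
    return total / len(ratios)


def acer_truncated_weights(ratios, c=10.0):
    """Truncated importance weights min(c, rho) for replayed samples.

    ratios[i] = pi_current(a_i | s_i) / mu_behaviour(a_i | s_i)
    c is an assumed truncation threshold, not a value from the paper.
    """
    return [min(c, r) for r in ratios]
```

In this sketch the on-policy objective only ever sees ratios close to 1, because the data comes from the most recent policy, whereas the off-policy weights can grow large for old replayed samples and are therefore capped at `c`.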

Taxonomy

Topics: Age of Information Optimization · Energy Efficient Wireless Sensor Networks · Energy Harvesting in Wireless Networks

Methods: Retrace · Softmax · Convolution · Experience Replay · Entropy Regularization · Dense Connections · Proximal Policy Optimization · Trust Region Policy Optimization