TL;DR
This paper uses deep reinforcement learning (DRL) with proximal policy optimization (PPO) to dynamically allocate incoming URLLC traffic on a time-frequency grid that is fully occupied by eMBB traffic, meeting the URLLC latency requirement while keeping eMBB codeword outages low.
Contribution
The paper presents a DRL-based resource slicing method that manages URLLC and eMBB coexistence by puncturing eMBB codewords without violating the URLLC latency requirement.
Findings
The DRL policy satisfies the URLLC latency requirement
Reduces eMBB codeword outages compared to existing schemes
Uses PPO for dynamic and efficient resource allocation
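For readers unfamiliar with PPO, the following is a minimal sketch of its standard clipped surrogate objective, written in plain Python. It is not taken from the paper: the function name `ppo_clip_objective`, the argument names, and the default `clip_eps=0.2` are illustrative assumptions, and the paper's actual network, reward, and hyperparameters are not stated in this summary.

```python
import math

def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximize).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy. clip_eps is an assumed value.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(lp_new - lp_old)  # probability ratio
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the pessimistic (smaller) of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

The clipping removes the incentive to move the policy far from the one that collected the data in a single update, which is the property that makes PPO comparatively stable to train.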
Abstract
With the advent of 5G and ongoing research into beyond-5G (B5G) networks, a novel and highly relevant research question is how to manage the coexistence of different types of traffic, each with very stringent but completely different requirements. In this paper we propose a deep reinforcement learning (DRL) algorithm to slice the available physical layer resources between ultra-reliable low-latency communications (URLLC) and enhanced mobile broadband (eMBB) traffic. Specifically, in our setting the time-frequency resource grid is fully occupied by eMBB traffic, and we train the DRL agent with proximal policy optimization (PPO), a state-of-the-art DRL algorithm, to dynamically allocate the incoming URLLC traffic by puncturing eMBB codewords. Assuming that each eMBB codeword can tolerate a certain limited amount of puncturing, beyond which it is in outage, we show that the policy devised by the…
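The puncturing model described in the abstract can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the `Codeword` class, the integer tolerance units, and the hand-written `largest_margin_policy` heuristic, which merely stands in for the learned PPO policy. Each URLLC arrival is placed immediately, so latency is trivially met, and a codeword is counted as in outage once its punctured amount exceeds its tolerance.

```python
from dataclasses import dataclass

@dataclass
class Codeword:
    tolerance: int       # puncturing the codeword can absorb (hypothetical units)
    punctured: int = 0   # amount already taken by URLLC traffic

    @property
    def in_outage(self) -> bool:
        return self.punctured > self.tolerance

def largest_margin_policy(codewords):
    """Heuristic stand-in for the learned policy: pick the codeword
    with the most remaining puncturing margin."""
    return max(range(len(codewords)),
               key=lambda i: codewords[i].tolerance - codewords[i].punctured)

def first_codeword_policy(codewords):
    """Naive baseline: always puncture the first codeword."""
    return 0

def simulate(codewords, arrivals, policy):
    """Serve each URLLC arrival at once and return the number of outages."""
    for amount in arrivals:
        codewords[policy(codewords)].punctured += amount
    return sum(cw.in_outage for cw in codewords)

def fresh_grid():
    return [Codeword(tolerance=2), Codeword(tolerance=2)]

arrivals = [1, 1, 1, 1]
balanced = simulate(fresh_grid(), arrivals, largest_margin_policy)
naive = simulate(fresh_grid(), arrivals, first_codeword_policy)
```

In this toy setting, spreading the punctures keeps both codewords within tolerance, while always hitting the same codeword pushes it into outage. A DRL agent would replace the heuristic with a policy learned from a reward that penalizes outages.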
