Post-Decision State-Based Online Learning for Delay-Energy-Aware Flow Allocation in Wireless Systems

Mahesh Ganesh Bhat, Shana Moothedath, and Prasanna Chaporkar

arXiv:2601.03108·eess.SP·January 7, 2026


Open Access

TL;DR

This paper presents a structure-aware reinforcement learning method using post-decision states for efficient delay- and energy-aware flow allocation in 5G wireless systems, improving convergence speed and resource management.

Contribution

The paper introduces a PDS-based value iteration algorithm that exploits the structure of the MDP to learn efficiently without prior statistical knowledge of the exogenous dynamics, such as flow arrivals and departures.
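
For readers unfamiliar with post-decision states, the decomposition can be written generically as follows. This is the textbook form of a PDS Bellman recursion, not necessarily the paper's exact formulation; the symbols $s$, $a$, $\tilde{s}$, $w$, $f$, $g$, $c_k$, $c_u$, and $\gamma$ are notation assumed here for illustration.

$$
\tilde{s} = f(s, a), \qquad s' = g(\tilde{s}, w),
$$

$$
V(s) = \min_{a}\Big[c_k(s, a) + \tilde{V}\big(f(s, a)\big)\Big], \qquad
\tilde{V}(\tilde{s}) = \mathbb{E}_{w}\Big[c_u(\tilde{s}, w) + \gamma\, V\big(g(\tilde{s}, w)\big)\Big].
$$

Here $f$ is the known, action-controlled transition to the post-decision state $\tilde{s}$, $w$ is the exogenous randomness, $c_k$ and $c_u$ are the known and unknown parts of the cost, and $\gamma$ is a discount factor. Because the expectation sits outside the minimization, $\tilde{V}$ can be estimated from sample averages without a model of $w$.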

Findings

01 Faster convergence compared to standard Q-learning

02 Lower long-term cost in resource allocation

03 Effective in heterogeneous 5G UPFs

Abstract

We develop a structure-aware reinforcement learning (RL) approach for delay- and energy-aware flow allocation in 5G User Plane Functions (UPFs). We consider a dynamic system with $K$ heterogeneous UPFs of varying capacities that handle stochastic arrivals of $M$ flow types, each with distinct rate requirements. We model the system as a Markov decision process (MDP) to capture the stochastic nature of flow arrivals and departures (possibly unknown), as well as the impact of flow allocation in the system. To solve this problem, we propose a post-decision state (PDS) based value iteration algorithm that exploits the underlying structure of the MDP. By separating action-controlled dynamics from exogenous factors, PDS enables faster convergence and efficient adaptive flow allocation, even in the absence of statistical knowledge about exogenous variables. Simulation results demonstrate that…
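
The separation of action-controlled dynamics from exogenous factors described in the abstract can be illustrated with a minimal sketch of tabular PDS learning on a toy system. Everything specific here is an assumption made for illustration and is not taken from the paper: the two UPF capacities in `CAPS`, the cost terms (`ENERGY`, a load-over-capacity delay proxy, `REJECT`), a single flow type, the discount `GAMMA`, and the simulator statistics `P_ARR` and `P_DEP`, which the learner never reads.

```python
import random
from itertools import product

# All constants below are hypothetical values chosen for illustration.
CAPS = (2, 3)            # UPF capacities, in flows
ENERGY = (1.0, 1.5)      # per-slot energy cost of an active UPF
REJECT = 5.0             # penalty for rejecting an arriving flow
GAMMA = 0.9              # discount factor
P_ARR, P_DEP = 0.6, 0.3  # exogenous statistics, used only by the simulator

def actions(loads, arrival):
    """Feasible actions: a UPF index with spare capacity, or None (reject / no-op)."""
    if not arrival:
        return [None]
    return [k for k, c in enumerate(CAPS) if loads[k] < c] + [None]

def post_state(loads, action):
    """Known, deterministic effect of the action: the post-decision loads."""
    if action is None:
        return loads
    return tuple(n + (k == action) for k, n in enumerate(loads))

def cost(loads, arrival, action):
    """Known one-step cost: energy of active UPFs, a delay proxy, rejections."""
    pds = post_state(loads, action)
    energy = sum(e for e, n in zip(ENERGY, pds) if n > 0)
    delay = sum(n / c for n, c in zip(pds, CAPS))
    return energy + delay + (REJECT if arrival and action is None else 0.0)

def greedy(V, loads, arrival):
    """Return (value, action) minimising known cost plus PDS value."""
    options = ((cost(loads, arrival, a) + V[post_state(loads, a)], a)
               for a in actions(loads, arrival))
    return min(options, key=lambda t: t[0])

def exogenous_step(pds, rng):
    """Simulator only: random departures, then a random new arrival."""
    loads = tuple(sum(rng.random() >= P_DEP for _ in range(n)) for n in pds)
    return loads, rng.random() < P_ARR

def pds_learning(steps=20000, seed=0):
    """Learn a value table over post-decision states from sampled transitions."""
    rng = random.Random(seed)
    V = {s: 0.0 for s in product(*(range(c + 1) for c in CAPS))}
    visits = dict.fromkeys(V, 0)
    loads, arrival = (0,) * len(CAPS), True
    for _ in range(steps):
        _, a = greedy(V, loads, arrival)
        pds = post_state(loads, a)
        loads, arrival = exogenous_step(pds, rng)
        # Sample of gamma * V(next state); no arrival/departure model needed.
        target = GAMMA * greedy(V, loads, arrival)[0]
        visits[pds] += 1
        V[pds] += (target - V[pds]) / visits[pds]  # running average
    return V
```

After training, `greedy(V, loads, arrival)[1]` gives the allocation for a state, for example `greedy(pds_learning(), (1, 2), True)[1]`. The table is indexed by post-decision loads only, so it is smaller than a Q-table over state-action pairs, which is one intuition for the faster convergence the paper reports relative to Q-learning.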


Taxonomy

Topics: Advanced MIMO Systems Optimization · Wireless Networks and Protocols · Advanced Wireless Network Optimization