Multi-Provider NFV Network Service Delegation via Average Reward Reinforcement Learning
Bahador Bakhshi, Josep Mangues-Bafalluy, Jorge Baranda

TL;DR
This paper introduces an average reward reinforcement learning method for service delegation in multi-provider 5G/6G networks, outperforming traditional Q-learning and greedy policies in maximizing profit.
Contribution
It proposes an R-Learning (average reward reinforcement learning) approach to admission control for NFV service delegation that handles unknown arrival and departure rates and outperforms Q-learning and greedy policies.
Findings
R-Learning outperforms Q-Learning and greedy policies.
The optimality gap of R-Learning is at most 9% in simulations.
In the experimental evaluation, R-Learning performs competitively with the optimal MDP solution.
Abstract
In multi-provider 5G/6G networks, service delegation enables administrative domains to federate in provisioning NFV network services. Admission control is fundamental in selecting the appropriate deployment domain to maximize average profit without prior knowledge of service requests' statistical distributions. This paper analyzes a general federation contract model for service delegation in various ways. First, under the assumption of known system dynamics, we obtain the theoretically optimal performance bound by formulating the admission control problem as an infinite-horizon Markov decision process (MDP) and solving it through dynamic programming. Second, we apply reinforcement learning to practically tackle the problem when the arrival and departure rates are not known. As Q-learning maximizes the discounted rewards, we prove it is not an efficient solution due to its sensitivity to…
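The contrast between discounted and average reward learning described in the abstract can be illustrated with a minimal sketch of a tabular R-learning update in the textbook style. It is not taken from the paper: the function name `r_learning_step`, the step sizes `alpha` and `beta`, and the action labels `"accept"` and `"reject"` are all hypothetical. Where Q-learning scales the next-state value by a discount factor, this update measures each reward against a running estimate `rho` of the average reward per step.

```python
from collections import defaultdict


def r_learning_step(Q, rho, s, a, r, s_next, actions, alpha=0.1, beta=0.01):
    """Apply one tabular R-learning update in place and return the new rho.

    Q       : mapping (state, action) -> relative action value
    rho     : current estimate of the average reward per step
    actions : actions available in every state (a simplifying assumption)
    """
    best_here = max(Q[(s, b)] for b in actions)
    best_next = max(Q[(s_next, b)] for b in actions)
    # Only greedy actions update rho, so exploration does not bias the estimate.
    was_greedy = Q[(s, a)] == best_here

    # Relative value update: no discount factor, the reward is offset by rho.
    Q[(s, a)] += alpha * (r - rho + best_next - Q[(s, a)])

    if was_greedy:
        rho += beta * (r - rho + best_next - best_here)
    return rho


# Hypothetical usage: two states, two admission decisions.
actions = ["accept", "reject"]
Q = defaultdict(float)
rho = r_learning_step(Q, 0.0, 0, "accept", 1.0, 1, actions, alpha=0.5, beta=0.1)
rho = r_learning_step(Q, rho, 0, "reject", 0.0, 0, actions, alpha=0.5, beta=0.1)
```

In this sketch the second step takes a non-greedy action, so only the action value changes and `rho` keeps its previous estimate. A real admission controller would also need a state encoding (for example, the services currently deployed per domain) and an exploration policy, neither of which is shown here.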
Taxonomy
Topics: Advanced MIMO Systems Optimization · Software-Defined Networks and 5G · Age of Information Optimization
