The association problem in wireless networks: a Policy Gradient Reinforcement Learning approach
Richard Combes, Ilham El Bouloumi, Stephane Senecal, Zwi Altman

TL;DR
This paper introduces a scalable, stable, and robust self-optimized association algorithm for wireless networks based on Policy Gradient Reinforcement Learning (PGRL). The association problem is modeled as a Markov Decision Process (MDP); the algorithm has proven convergence to a local optimum and is a candidate for practical implementation.
Contribution
It develops a model-free, robust PGRL-based association algorithm for wireless networks, capable of continuous learning with limited performance degradation.
Findings
Algorithm converges to a local optimum
Average cost decreases monotonically during learning
Suitable for practical 'always-on' deployment
Abstract
The purpose of this paper is to develop a self-optimized association algorithm based on PGRL (Policy Gradient Reinforcement Learning) that is scalable, stable, and robust. Here, robust means that performance degradation during the learning phase should be either forbidden or limited to predefined thresholds. The algorithm is model-free (as opposed to Value Iteration) and robust (as opposed to Q-Learning). The association problem is modeled as a Markov Decision Process (MDP). The policy space is parameterized, and the parameterized family of policies is then used as expert knowledge for the PGRL. The PGRL converges towards a local optimum, and the average cost decreases monotonically during the learning process. These properties make the solution a good candidate for practical implementation. Furthermore, the robustness property allows the PGRL algorithm to be used in an "always-on" learning…
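To make the idea of a parameterized policy trained by policy gradient more concrete, here is a minimal toy sketch. It is not the authors' algorithm: the queueing dynamics, the cost (users divided by station capacity), the softmax parameterization, and every name in it (policy, grad_log_policy, train, depart_prob, max_load, trace_decay) are assumptions made for illustration. The update is a generic score-function (REINFORCE-style) estimator with an eligibility trace and a running-average baseline, swapped in because the abstract does not specify the estimator.

```python
import math
import random


def policy(theta, loads):
    """Softmax association probabilities, one per base station.

    Hypothetical parameterization: theta[i] is a bias for station i and
    theta[-1] is a shared penalty on the station's current load.
    """
    prefs = [theta[i] - theta[-1] * loads[i] for i in range(len(loads))]
    top = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - top) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def grad_log_policy(theta, loads, action):
    """Gradient of log pi(action | loads) with respect to theta."""
    probs = policy(theta, loads)
    grad = [(1.0 if i == action else 0.0) - p for i, p in enumerate(probs)]
    mean_load = sum(p * n for p, n in zip(probs, loads))
    grad.append(mean_load - loads[action])  # derivative w.r.t. the load penalty
    return grad


def train(capacities, steps=5000, step_size=0.001, trace_decay=0.9,
          depart_prob=0.4, max_load=20, seed=0):
    """Run the toy learner and return (theta, running average cost)."""
    rng = random.Random(seed)
    k = len(capacities)
    theta = [0.0] * (k + 1)
    trace = [0.0] * (k + 1)   # eligibility trace of score-function gradients
    loads = [0] * k
    avg_cost = 0.0            # running average, used as a baseline
    for t in range(1, steps + 1):
        # Departures: faster stations finish users more often.
        for i in range(k):
            if loads[i] > 0 and rng.random() < depart_prob * capacities[i]:
                loads[i] -= 1
        # One arrival per step, associated by sampling the current policy.
        probs = policy(theta, loads)
        action = rng.choices(range(k), weights=probs)[0]
        grad = grad_log_policy(theta, loads, action)
        loads[action] = min(loads[action] + 1, max_load)
        # Cost: a crude proxy for total delay (users divided by capacity).
        cost = sum(n / c for n, c in zip(loads, capacities))
        trace = [trace_decay * z + g for z, g in zip(trace, grad)]
        # Descend the estimated gradient of the average cost.
        theta = [th - step_size * (cost - avg_cost) * z
                 for th, z in zip(theta, trace)]
        avg_cost += (cost - avg_cost) / t
    return theta, avg_cost


if __name__ == "__main__":
    theta, avg_cost = train(capacities=(1.0, 2.0))
    print("theta:", theta)
    print("average cost:", avg_cost)
```

The sketch only illustrates the model-free aspect: the learner never uses the arrival or departure probabilities, only sampled costs. It makes no attempt to reproduce the robustness or monotone-decrease properties claimed in the abstract, which would require the paper's own construction.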
Taxonomy
Topics: Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics · Advanced Wireless Network Optimization
