The association problem in wireless networks: a Policy Gradient Reinforcement Learning approach

Richard Combes; Ilham El Bouloumi; Stephane Senecal; Zwi Altman

arXiv:1306.2554·cs.NI·June 12, 2013·2 cites

Open Access

TL;DR

This paper introduces a scalable, stable, and robust self-optimized association algorithm for wireless networks based on Policy Gradient Reinforcement Learning. The association problem is modeled as a Markov Decision Process (MDP), and the algorithm converges to a local optimum, which makes it a candidate for practical implementation.

Contribution

It develops a model-free, robust PGRL-based association algorithm for wireless networks, capable of continuous learning with limited performance degradation.

Findings

01 Algorithm converges to a local optimum

02 Average cost decreases monotonically during learning

03 Suitable for practical 'always-on' deployment

Abstract

The purpose of this paper is to develop a self-optimized association algorithm based on PGRL (Policy Gradient Reinforcement Learning) that is scalable, stable, and robust. Here, robust means that performance degradation during the learning phase is either forbidden or limited to predefined thresholds. The algorithm is model-free (as opposed to Value Iteration) and robust (as opposed to Q-Learning). The association problem is modeled as a Markov Decision Process (MDP). The policy space is parameterized, and the parameterized family of policies is then used as expert knowledge for the PGRL. The PGRL converges towards a local optimum, and the average cost decreases monotonically during the learning process. These properties make the solution a good candidate for practical implementation. Furthermore, the robustness property allows the PGRL algorithm to be used in an "always-on" learning…
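
To make the idea of a parameterized policy tuned by a policy gradient concrete, here is a minimal sketch on a toy problem. Everything in it is an assumption made for illustration and is not taken from the paper: the two-station queue, the logistic policy on the load difference, the cost (number of users in the system), the OLPOMDP-style online update with an eligibility trace, and the names prob_station0, grad_log_policy, and train. It does not reproduce the authors' estimator or their monotonic-decrease guarantee.

```python
import math
import random


def prob_station0(theta, n0, n1):
    """Probability that an arriving user is associated with station 0.

    Hypothetical logistic policy on the load difference (n1 - n0).
    """
    x = theta[0] + theta[1] * (n1 - n0)
    x = max(-30.0, min(30.0, x))  # clamp to keep exp() well-behaved
    return 1.0 / (1.0 + math.exp(-x))


def grad_log_policy(theta, n0, n1, action):
    """Gradient of log pi(action | n0, n1) with respect to theta."""
    p0 = prob_station0(theta, n0, n1)
    # Logistic policy: d/dx log p0 = 1 - p0, d/dx log(1 - p0) = -p0.
    d = (1.0 - p0) if action == 0 else -p0
    return [d, d * (n1 - n0)]


def train(steps=20000, step_size=1e-3, beta=0.9, seed=0,
          arrival=0.5, service=(0.4, 0.2), cap=20):
    """Online policy-gradient loop on a toy two-station queue.

    All parameters (arrival and service probabilities, queue cap,
    discount beta for the trace) are illustrative assumptions.
    """
    rng = random.Random(seed)
    theta = [0.0, 0.0]   # policy parameters
    trace = [0.0, 0.0]   # eligibility trace of score functions
    n = [0, 0]           # users currently attached to each station
    avg_cost = 0.0       # running average cost, used as a baseline
    for t in range(1, steps + 1):
        if rng.random() < arrival:
            # A user arrives and the policy picks a station.
            p0 = prob_station0(theta, n[0], n[1])
            a = 0 if rng.random() < p0 else 1
            g = grad_log_policy(theta, n[0], n[1], a)
            trace = [beta * z + gi for z, gi in zip(trace, g)]
            n[a] = min(n[a] + 1, cap)  # simplification: queues are capped
        else:
            trace = [beta * z for z in trace]
        # Each non-empty station completes one user with its service probability.
        for i in (0, 1):
            if n[i] > 0 and rng.random() < service[i]:
                n[i] -= 1
        cost = n[0] + n[1]
        avg_cost += (cost - avg_cost) / t
        # Descend the estimated gradient of the average cost.
        theta = [th - step_size * (cost - avg_cost) * z
                 for th, z in zip(theta, trace)]
    return theta, avg_cost
```

Calling train() returns the learned parameters and the running average cost. In this toy setting one would expect the policy to learn to favor the faster or less loaded station, but the sketch makes no claim about how closely its behavior matches the convergence properties stated in the abstract.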

Taxonomy

Topics: Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics · Advanced Wireless Network Optimization