Off-policy Learning for Remote Electrical Tilt Optimization

Filippo Vannella; Jaeseong Jeong; Alexandre Proutiere

arXiv:2005.10577 · cs.LG · May 22, 2020

TL;DR

This paper presents an offline, off-policy Contextual Multi-Armed-Bandit (CMAB) approach to optimizing Remote Electrical Tilt (RET) in cellular networks. It aims to improve QoS while avoiding both risky online exploration and the simulation-to-reality gap.

Contribution

The paper formulates RET optimization as an off-policy CMAB problem, so that a tilt update policy can be learned from existing real-network data without online exploration.

Findings

01 Policies outperform the rule-based logging policy

02 Consistent improvements in network KPIs

03 Effective offline policy learning from real data

Abstract

We address the problem of Remote Electrical Tilt (RET) optimization using off-policy Contextual Multi-Armed-Bandit (CMAB) techniques. The goal in RET optimization is to control the orientation of the vertical tilt angle of the antenna to optimize Key Performance Indicators (KPIs) representing the Quality of Service (QoS) perceived by the users in cellular networks. Learning an improved tilt update policy is hard. On the one hand, coming up with a new policy in an online manner in a real network requires exploring tilt updates that have never been used before, and is operationally too risky. On the other hand, devising this policy via simulations suffers from the simulation-to-reality gap. In this paper, we circumvent these issues by learning an improved policy in an offline manner using existing data collected on real networks. We formulate the problem of devising such a policy using…
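To make the off-policy idea concrete, here is a minimal sketch of inverse propensity scoring (IPS), a common estimator for judging a new policy from logged bandit data alone. The abstract is truncated before it names the estimators the paper uses, so this illustrates the general technique and is not the authors' method. The function name `ips_value`, the three tilt actions, the toy log, and the assumption that the logging policy's action probabilities (propensities) are known and nonzero are all hypothetical.

```python
def ips_value(logged, target_policy):
    """Estimate the average reward of target_policy from logged bandit data.

    logged: list of (context, action, reward, propensity) tuples, where
            propensity is the logging policy's probability of the logged
            action in that context (assumed known and > 0).
    target_policy: function mapping a context to a list of action
            probabilities, one entry per action.
    """
    total = 0.0
    for context, action, reward, propensity in logged:
        # Reweight each logged reward by how much more (or less) likely the
        # target policy is to take the logged action than the logging policy.
        weight = target_policy(context)[action] / propensity
        total += weight * reward
    return total / len(logged)


# Hypothetical action set: 0 = down-tilt, 1 = keep tilt, 2 = up-tilt.
# Toy log with made-up contexts, rewards, and propensities.
logged = [
    ("cell_a", 0, 1.0, 0.25),
    ("cell_b", 1, 0.0, 0.50),
    ("cell_c", 2, 0.5, 0.25),
    ("cell_d", 1, 0.0, 0.50),
]


def always_down_tilt(context):
    # Hypothetical target policy: always choose action 0.
    return [1.0, 0.0, 0.0]


estimate = ips_value(logged, always_down_tilt)
print(estimate)
```

Only the first log entry matches the target policy's action, so it alone contributes to the estimate, with weight 1.0 / 0.25 = 4. A deterministic rule-based logging policy would give zero propensity to the actions it never takes, and plain IPS cannot evaluate a target policy on those actions.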
