Deep Reinforcement Learning with Symmetric Prior for Predictive Power Allocation to Mobile Users

Jianyu Zhao; Chenyang Yang

arXiv:2103.13298·cs.NI·March 25, 2021

TL;DR

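This paper builds a symmetric prior (permutation invariance and equivariance) into the actor and critic networks of a deep reinforcement learning agent that allocates power to mobile users streaming video, reducing both the number of training episodes and the model size while keeping performance comparable to vanilla DDPG.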

Contribution

The paper proposes a neural network design for DDPG based on a symmetric prior, which reduces sampling complexity and model size in wireless resource allocation tasks.

Findings

01 Free model parameters of the DDPG compressed by a ratio of 2/K^2

02 Training episodes reduced by about one third for K=10

03 Performance comparable to vanilla DDPG maintained
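One way to read the 2/K^2 figure is sketched below. It assumes the common parameter-sharing form of a permutation-equivariant layer, in which every user shares one "self" block and one "other users" block. The symbols U, V, d_in and d_out are illustrative and are not taken from the paper; the paper's exact layer structure may differ.

```latex
% Assumed weight structure of one permutation-equivariant layer for K users:
% the same block U on the diagonal, the same block V everywhere else.
\[
W =
\begin{bmatrix}
U & V & \cdots & V \\
V & U & \cdots & V \\
\vdots & \vdots & \ddots & \vdots \\
V & V & \cdots & U
\end{bmatrix},
\qquad U, V \in \mathbb{R}^{d_{\mathrm{out}} \times d_{\mathrm{in}}}
\]
% Free parameters: 2 blocks with sharing, versus K^2 independent blocks without.
\[
\frac{2\, d_{\mathrm{in}} d_{\mathrm{out}}}{K^{2}\, d_{\mathrm{in}} d_{\mathrm{out}}}
= \frac{2}{K^{2}},
\qquad K = 10 \;\Rightarrow\; \frac{2}{100} = 0.02
\]
```

Under this assumption, a layer for K=10 users keeps 2% of the weights of an unconstrained layer of the same input and output size.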

Abstract

Deep reinforcement learning has been applied to a variety of wireless tasks, but it is known to incur high training and inference complexity. In this paper, we use the deep deterministic policy gradient (DDPG) algorithm to optimize predictive power allocation among K mobile users requesting video streaming, minimizing the energy consumption of the network under a no-stalling constraint for each user. To reduce the sampling complexity and model size of the DDPG, we design the neural networks by exploiting a symmetric prior inherent in the actor and critic networks: permutation invariance and permutation equivariance. Our analysis shows that the free model parameters of the DDPG can be compressed by 2/K^2. Simulation results demonstrate that the number of episodes required by the learning model with the symmetric prior to achieve the same performance as the vanilla policy is reduced by…
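The two properties named in the abstract can be checked numerically. The sketch below is a minimal NumPy illustration, not the authors' implementation: it assumes a parameter-sharing layer with one "self" block `U` and one "other users" block `V`, a sum readout for invariance, and made-up sizes (`K`, `d_in`, `d_out`). All function and variable names are hypothetical.

```python
import numpy as np


def equivariant_weight(U, V, K):
    """Expand the two shared blocks into the full (K*d_out, K*d_in) weight:
    U on the diagonal blocks, V on every off-diagonal block (assumed structure)."""
    eye = np.eye(K)
    off_diag = np.ones((K, K)) - eye
    return np.kron(eye, U) + np.kron(off_diag, V)


def equivariant_layer(x, U, V):
    """x has shape (K, d_in), one row per user. Each output row is
    tanh(U @ x_k + V @ sum of the other users' rows)."""
    total = x.sum(axis=0, keepdims=True)
    return np.tanh(x @ U.T + (total - x) @ V.T)


def invariant_readout(h):
    """Summing over users removes any dependence on user ordering."""
    return h.sum(axis=0)


rng = np.random.default_rng(0)
K, d_in, d_out = 10, 3, 4  # illustrative sizes only
U = rng.normal(size=(d_out, d_in))
V = rng.normal(size=(d_out, d_in))
x = rng.normal(size=(K, d_in))
perm = rng.permutation(K)

y = equivariant_layer(x, U, V)
y_perm = equivariant_layer(x[perm], U, V)

# Equivariance: permuting the users permutes the outputs the same way.
is_equivariant = np.allclose(y_perm, y[perm])
# Invariance: the summed readout ignores the user ordering.
is_invariant = np.allclose(invariant_readout(y_perm), invariant_readout(y))

# The shared-block layer equals a dense layer with the structured weight.
W = equivariant_weight(U, V, K)
y_dense = np.tanh((W @ x.reshape(-1)).reshape(K, d_out))
matches_dense = np.allclose(y_dense, y)

# Free parameters with sharing versus an unconstrained dense weight.
ratio = (U.size + V.size) / W.size
print(is_equivariant, is_invariant, matches_dense, ratio)
```

In a DDPG setting, an equivariant mapping of this kind would suit an actor that outputs one power value per user, and an invariant readout would suit a critic that outputs a single value; how the paper actually composes these layers is not stated in the text above.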

Taxonomy

Topics: Green IT and Sustainability · Advanced MIMO Systems Optimization · Energy Harvesting in Wireless Networks

Methods: Weight Decay · Adam · Dense Connections · Convolution · Experience Replay · Batch Normalization · Deep Deterministic Policy Gradient