# Distributed Power Control for Large Energy Harvesting Networks: A Multi-Agent Deep Reinforcement Learning Approach

**Authors:** Mohit K. Sharma, Alessio Zappone, Mohamad Assaad, Merouane Debbah, Spyridon Vassilaras

arXiv: 1904.00601 · 2019-10-23

## TL;DR

This paper introduces a multi-agent deep reinforcement learning framework for online power control in large energy harvesting networks. It models the problem as a discrete-time mean-field game and shows that the optimal policies can be learned in a completely distributed fashion.

## Contribution

The paper develops a distributed MARL approach based on mean-field game theory for a large energy harvesting (EH) multiple access channel. It proves that the mean-field game has a unique stationary solution and that the proposed learning procedure converges to it.

## Key findings

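- Both distributed policies achieve throughput close to that of the centralized policies.
- The proposed learning procedure converges to the unique stationary solution of the mean-field game.
- DNN-based centralized policies perform very well in large EH networks, where designing optimal policies with conventional methods such as Markov decision processes is intractable.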

## Abstract

In this paper, we develop a multi-agent reinforcement learning (MARL) framework to obtain online power control policies for a large energy harvesting (EH) multiple access channel, when only causal information about the EH process and wireless channel is available. In the proposed framework, we model the online power control problem as a discrete-time mean-field game (MFG), and analytically show that the MFG has a unique stationary solution. Next, we leverage the fictitious play property of mean-field games and deep reinforcement learning to learn the stationary solution of the game in a completely distributed fashion. We analytically show that the proposed procedure converges to the unique stationary solution of the MFG. This, in turn, ensures that the optimal policies can be learned in a completely distributed fashion. In order to benchmark the performance of the distributed policies, we also develop deep neural network (DNN) based centralized and distributed online power control schemes. Our simulation results show the efficacy of the proposed power control policies. In particular, the DNN-based centralized power control policies provide very good performance for large EH networks, for which the design of optimal policies is intractable using conventional methods such as Markov decision processes. Further, the throughput of both distributed policies is close to that achieved by the centralized policies.
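
To make the fictitious-play idea in the abstract concrete, the following is a minimal toy sketch, not the paper's algorithm. It assumes a small discrete battery, Bernoulli energy arrivals, and a log-rate reward degraded by the mean transmit energy of the other nodes; all of these, along with the names `B`, `P_HARVEST`, `GAMMA`, `C_INTERF`, `best_response`, `induced_mean_power`, and `fictitious_play`, are illustrative assumptions. Tabular value iteration stands in for the deep reinforcement learning step that the paper uses to learn each node's best response.

```python
import numpy as np

# All constants below are illustrative assumptions, not values from the paper.
B = 5            # battery capacity in energy units
P_HARVEST = 0.5  # probability that one energy unit arrives in a slot
GAMMA = 0.9      # discount factor
C_INTERF = 1.0   # how strongly the mean field degrades a node's rate


def reward(a, m):
    """Toy rate: log2(1 + own energy / (1 + interference from mean field m))."""
    return np.log2(1.0 + a / (1.0 + C_INTERF * m))


def best_response(m, iters=200):
    """Value iteration for one node against a fixed mean field m.

    Stand-in for the deep RL step; returns the energy to spend per battery level.
    """
    V = np.zeros(B + 1)
    policy = np.zeros(B + 1, dtype=int)
    for _ in range(iters):
        V_new = np.empty_like(V)
        for b in range(B + 1):
            q = []
            for a in range(b + 1):  # cannot spend more than is stored
                stay = b - a                # no energy arrives
                gain = min(b - a + 1, B)    # one unit arrives, capped at B
                q.append(reward(a, m) + GAMMA * (
                    (1 - P_HARVEST) * V[stay] + P_HARVEST * V[gain]))
            policy[b] = int(np.argmax(q))
            V_new[b] = max(q)
        V = V_new
    return policy


def induced_mean_power(policy, steps=500):
    """Mean energy spent per slot after propagating the battery distribution."""
    dist = np.full(B + 1, 1.0 / (B + 1))
    for _ in range(steps):
        nxt = np.zeros(B + 1)
        for b in range(B + 1):
            a = policy[b]
            nxt[b - a] += dist[b] * (1 - P_HARVEST)
            nxt[min(b - a + 1, B)] += dist[b] * P_HARVEST
        dist = nxt
    return float(dist @ policy)


def fictitious_play(rounds=50):
    """Alternate best responses with a running average of the mean field."""
    m = 0.0
    for k in range(1, rounds + 1):
        policy = best_response(m)            # respond to the current belief
        m_new = induced_mean_power(policy)   # mean field that response induces
        m += (m_new - m) / k                 # fictitious-play averaging
    return policy, m


policy, mean_field = fictitious_play()
print("policy per battery level:", policy)
print("mean field (avg. energy per slot):", mean_field)
```

The returned `policy` is the best response to the averaged mean field from the round before the last update. In this sketch every node is assumed to be statistically identical, which is what allows a single representative node to stand in for the whole population.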

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1904.00601/full.md

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/1904.00601/full.md

## References

43 references — full list in the complete paper: https://tomesphere.com/paper/1904.00601/full.md

---
Source: https://tomesphere.com/paper/1904.00601