# Voting-Based Multi-Agent Reinforcement Learning for Intelligent IoT

**Authors:** Yue Xu, Zengde Deng, Mengdi Wang, Wenjun Xu, Anthony Man-Cho So, Shuguang Cui

arXiv: 1907.01385 · 2020-09-01

## TL;DR

This paper introduces a voting-based multi-agent reinforcement learning (MARL) framework for IoT systems. Agents vote to make group decisions, and a distributed primal-dual algorithm reaches the optimal solution at the same sublinear convergence rate as centralized learning.

## Contribution

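The paper formulates a voting-based MARL problem for IoT from the linear programming form of policy optimization, with the goal of maximizing the globally averaged returns. It proposes a distributed primal-dual algorithm to obtain the optimal solution, together with a voting mechanism through which distributed learning achieves the same sublinear convergence rate as centralized learning.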

## Key findings

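- With the proposed voting mechanism, distributed learning achieves the same sublinear convergence rate as centralized learning, so distributed decision making does not slow progress toward global consensus on optimality.
- Numerical simulations verify the convergence of the proposed algorithm.
- Case studies apply the algorithm to practical multi-agent IoT systems.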

## Abstract

The recent success of single-agent reinforcement learning (RL) in Internet of things (IoT) systems motivates the study of multi-agent reinforcement learning (MARL), which is more challenging but more useful in large-scale IoT. In this paper, we consider a voting-based MARL problem, in which the agents vote to make group decisions and the goal is to maximize the globally averaged returns. To this end, we formulate the MARL problem based on the linear programming form of the policy optimization problem and propose a distributed primal-dual algorithm to obtain the optimal solution. We also propose a voting mechanism through which the distributed learning achieves the same sublinear convergence rate as centralized learning. In other words, the distributed decision making does not slow down the process of achieving global consensus on optimality. Lastly, we verify the convergence of our proposed algorithm with numerical simulations and conduct case studies in practical multi-agent IoT systems.
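For readers unfamiliar with the "linear programming form of the policy optimization problem" mentioned in the abstract, the following is a sketch of the standard textbook LP for a discounted MDP, not necessarily the exact formulation used in the paper. All symbols are assumptions introduced here for illustration: $N$ agents, discount factor $\gamma \in (0,1)$, initial state distribution $\xi$, transition matrix $P_a$ for action $a$, local reward vector $r_a^i$ of agent $i$, and averaged reward $\bar r_a = \frac{1}{N}\sum_{i=1}^{N} r_a^i$.

$$
\begin{aligned}
\min_{v}\quad & (1-\gamma)\,\xi^{\top} v \\
\text{s.t.}\quad & (I - \gamma P_a)\,v \;\ge\; \bar r_a \quad \text{for all actions } a .
\end{aligned}
$$

Attaching a nonnegative multiplier $\mu_a$ to each constraint gives the saddle-point problem that primal-dual methods work on:

$$
\min_{v}\;\max_{\mu \ge 0}\; L(v,\mu)
= (1-\gamma)\,\xi^{\top} v + \sum_{a} \mu_a^{\top}\bigl(\bar r_a - (I - \gamma P_a)\,v\bigr).
$$

Because $\bar r_a$ is an average of local rewards, the Lagrangian splits as $L(v,\mu) = \frac{1}{N}\sum_{i=1}^{N} L^{i}(v,\mu)$, where $L^{i}$ is the same expression with $r_a^i$ in place of $\bar r_a$. Under these assumptions, each agent can evaluate its own term from local rewards alone, which is one way to see why a distributed primal-dual scheme is natural for maximizing globally averaged returns.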

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1907.01385/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/1907.01385/full.md

## References

63 references — full list in the complete paper: https://tomesphere.com/paper/1907.01385/full.md

---
Source: https://tomesphere.com/paper/1907.01385