Multiagent Rollout Algorithms and Reinforcement Learning

Dimitri Bertsekas

arXiv:1910.00120 · cs.LG · April 15, 2020 · 23 cites

Open Access · 1 Repo

TL;DR

This paper introduces a multiagent rollout algorithm for dynamic programming in which total computation grows linearly, rather than exponentially, with the number of agents, while retaining the cost improvement property of standard rollout. It treats both finite and infinite horizon problems.

Contribution

It presents a novel multiagent rollout approach in which total computation grows linearly with the number of agents, and proves its cost improvement and convergence properties.

Findings

01

The computation required of each agent at every stage is independent of the total number of agents.

02

Total computation grows linearly with the number of agents.

03

The algorithm guarantees performance improvement over the base policy.
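
The first two findings can be made concrete with a rough count. The symbols here are assumptions introduced for illustration: suppose there are m agents and each agent has at most n candidate decisions at a stage. Standard rollout compares every joint control, while the multiagent scheme lets the agents choose one at a time, so the number of base-policy evaluations per stage is approximately

```latex
\[
\underbrace{n^{m}}_{\text{standard rollout}}
\qquad \text{versus} \qquad
\underbrace{n \cdot m}_{\text{multiagent rollout}}
\]
% Example: n = 10, m = 5 gives 10^5 = 100000 versus 10 * 5 = 50.
```

Each agent contributes at most n evaluations regardless of m, which is the sense in which per-agent computation does not depend on the number of agents.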

Abstract

We consider finite and infinite horizon dynamic programming problems, where the control at each stage consists of several distinct decisions, each one made by one of several agents. We introduce an approach, whereby at every stage, each agent's decision is made by executing a local rollout algorithm that uses a base policy, together with some coordinating information from the other agents. The amount of local computation required at every stage by each agent is independent of the number of agents, while the amount of total computation (over all agents) grows linearly with the number of agents. By contrast, with the standard rollout algorithm, the amount of total computation grows exponentially with the number of agents. Despite the drastic reduction in required computation, we show that our algorithm has the fundamental cost improvement property of rollout: an improved performance…
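
The contrast drawn in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the toy problem (agents on the integer line moving toward the origin), the deterministic dynamics, and every name below (`step`, `base_policy`, `q_factor`, `standard_rollout`, `multiagent_rollout`) are hypothetical. It follows one reading of the scheme: agents decide in a fixed order, each one seeing the choices of the agents before it and assuming the base policy for the agents after it and for all later stages.

```python
from itertools import product
from math import inf

# Hypothetical toy problem (not from the paper): m agents on the integer line,
# each moves by -1, 0, or +1 per stage; the stage cost is the summed distance
# of all agents from the origin after the move. Dynamics are deterministic.
CHOICES = (-1, 0, 1)

def step(state, control):
    """Apply one joint control; return (stage_cost, next_state)."""
    nxt = tuple(x + u for x, u in zip(state, control))
    return sum(abs(x) for x in nxt), nxt

def base_policy(state):
    """Deliberately poor base policy: every agent stays put."""
    return tuple(0 for _ in state)

def q_factor(state, control, horizon):
    """Cost of applying `control` now, then the base policy for the remaining stages."""
    total, state = step(state, control)
    for _ in range(horizon - 1):
        cost, state = step(state, base_policy(state))
        total += cost
    return total

def standard_rollout(state, horizon):
    """Minimize over all joint controls: len(CHOICES) ** m Q-factor evaluations."""
    joint = list(product(CHOICES, repeat=len(state)))
    best = min(joint, key=lambda u: q_factor(state, u, horizon))
    return best, len(joint)

def multiagent_rollout(state, horizon):
    """Agents decide one at a time: len(CHOICES) * m Q-factor evaluations."""
    # Agents that have not decided yet are assumed to follow the base policy.
    control = list(base_policy(state))
    evaluations = 0
    for i in range(len(state)):
        best_choice, best_q = control[i], inf
        for c in CHOICES:
            candidate = control[:i] + [c] + control[i + 1:]
            q = q_factor(state, tuple(candidate), horizon)
            evaluations += 1
            if q < best_q:
                best_choice, best_q = c, q
        # The fixed choice is the coordinating information seen by later agents.
        control[i] = best_choice
    return tuple(control), evaluations

state, horizon = (3, -2, 1), 2
u_multi, n_multi = multiagent_rollout(state, horizon)
u_std, n_std = standard_rollout(state, horizon)
print("multiagent:", u_multi, "evaluations:", n_multi)
print("standard:  ", u_std, "evaluations:", n_std)
```

On this three-agent instance the multiagent version performs 9 Q-factor evaluations against 27 for the joint minimization, and the control it selects costs no more than following the base policy from the same state. The two methods happen to agree here because the toy cost is separable across agents; in general the multiagent control need not match the standard rollout control.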

Code & Models

Repositories

cor3bit/bertsekas-marl
pytorch

Taxonomy

Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research · Optimization and Search Problems