Asynchronous, Option-Based Multi-Agent Policy Gradient: A Conditional Reasoning Approach

Xubo Lyu; Amin Banitalebi-Dehkordi; Mo Chen; Yong Zhang

arXiv:2203.15925·cs.RO·August 3, 2023

Open Access

TL;DR

This paper introduces an asynchronous, option-based multi-agent policy gradient method that employs a conditional reasoning approach to handle asynchronous option execution, improving policy learning in complex multi-agent tasks.

Contribution

It proposes a novel conditional reasoning framework for asynchronous option execution in multi-agent policy gradients, enhancing learning efficiency in complex environments.

Findings

01 Effective in complex multi-agent cooperative tasks

02 Handles asynchronous option execution successfully

03 Improves policy gradient estimation in high-dimensional spaces

Abstract

Cooperative multi-agent problems often require coordination between agents, which can be achieved through a centralized policy that considers the global state. Multi-agent policy gradient (MAPG) methods are commonly used to learn such policies, but they are often limited to problems with low-level action spaces. In complex problems with large state and action spaces, it is advantageous to extend MAPG methods to use higher-level actions, also known as options, to improve the policy search efficiency. However, multi-robot option executions are often asynchronous; that is, agents may select and complete their options at different time steps. This makes it difficult for MAPG methods to derive a centralized policy and evaluate its gradient, because a centralized policy always selects new options at the same time. In this work, we propose a novel, conditional reasoning approach to address this…
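
The asynchrony described in the abstract can be illustrated with a toy sketch. This is not the paper's method and does not implement its conditional reasoning approach; it only shows why agents running options of different lengths rarely need a new option at the same time step. The function name `selection_schedule` and the fixed per-agent option lengths are hypothetical, chosen purely for illustration.

```python
def selection_schedule(option_lengths, horizon):
    """Return, for each time step, the agents that must pick a new option.

    option_lengths[i] is a fixed duration (in steps) assumed for every
    option of agent i. Real options usually terminate stochastically;
    fixed lengths are an assumption that keeps the sketch deterministic.
    """
    remaining = [0] * len(option_lengths)  # steps left in each agent's current option
    schedule = []
    for _ in range(horizon):
        # An agent selects a new option only when its current one has finished.
        selecting = [i for i, r in enumerate(remaining) if r == 0]
        for i in selecting:
            remaining[i] = option_lengths[i]
        schedule.append(selecting)
        # Every agent executes one step of its current option.
        remaining = [r - 1 for r in remaining]
    return schedule


# Agent 0 uses 2-step options, agent 1 uses 3-step options.
schedule = selection_schedule([2, 3], horizon=7)
joint_steps = [t for t, agents in enumerate(schedule) if len(agents) == 2]
```

In this toy run both agents select together only at steps 0 and 6, whereas a centralized policy over joint options presumes a joint selection at every decision point, which is the mismatch the abstract points to.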

Taxonomy

Topics: Reinforcement Learning in Robotics · Optimization and Search Problems · Auction Theory and Applications