Macro-Action-Based Multi-Agent/Robot Deep Reinforcement Learning under Partial Observability

Yuchen Xiao

arXiv:2209.10003 · cs.AI · October 12, 2022

TL;DR

This thesis introduces macro-action-based deep reinforcement learning methods for multi-agent systems operating under partial observability. The methods let agents make decisions asynchronously and improve scalability in complex real-world tasks.

Contribution

The thesis develops value-based and policy-gradient RL algorithms for MacDec-POMDPs, which let agents make macro-action decisions asynchronously in multi-agent reinforcement learning.
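To make "asynchronous macro-action decision-making" concrete, the following is a minimal sketch, not the thesis's implementation. It assumes each macro-action runs for a fixed number of primitive steps (real macro-actions usually end on a termination condition) and that agents pick macro-actions at random. The names `MacroAction`, `Agent`, and `run` are hypothetical. The point it illustrates is that an agent selects a new macro-action only when its current one has finished, so decision times differ across agents.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MacroAction:
    """Hypothetical macro-action: a name and a fixed duration in primitive steps."""
    name: str
    duration: int


@dataclass
class Agent:
    """Hypothetical agent tracking its current macro-action and steps remaining."""
    options: List[MacroAction]
    current: Optional[MacroAction] = None
    remaining: int = 0

    def step(self, rng: random.Random) -> bool:
        """Advance one primitive step; return True if a new macro-action was chosen."""
        chose = False
        if self.remaining == 0:
            # The previous macro-action has finished, so this agent decides now.
            self.current = rng.choice(self.options)
            self.remaining = self.current.duration
            chose = True
        self.remaining -= 1
        return chose


def run(agents: List[Agent], horizon: int, seed: int = 0) -> List[List[int]]:
    """Return, for each time step, the indices of agents that made a decision."""
    rng = random.Random(seed)
    decisions = []
    for _ in range(horizon):
        decisions.append([i for i, agent in enumerate(agents) if agent.step(rng)])
    return decisions


if __name__ == "__main__":
    team = [
        Agent(options=[MacroAction("go_to_shelf", 2)]),
        Agent(options=[MacroAction("deliver_tool", 3)]),
    ]
    for t, who in enumerate(run(team, horizon=6)):
        print(t, who)
```

In this toy run the first agent decides every two steps and the second every three, so the two agents rarely choose at the same time. A synchronized primitive-action method would instead require every agent to choose at every step.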

Findings

01 The proposed algorithms outperform existing methods on large multi-agent problems.

02 They are effective in both simulation and real-robot experiments.

03 Macro-actions enable scalable learning and high-quality solutions.

Abstract

State-of-the-art multi-agent reinforcement learning (MARL) methods have provided promising solutions to a variety of complex problems. Yet these methods all assume that agents execute synchronized primitive actions, so they do not genuinely scale to long-horizon real-world multi-agent/robot tasks, which inherently require agents/robots to reason asynchronously about high-level action selection over varying time durations. The Macro-Action Decentralized Partially Observable Markov Decision Process (MacDec-POMDP) is a general formalization for asynchronous decision-making under uncertainty in fully cooperative multi-agent tasks. In this thesis, we first propose a group of value-based RL approaches for MacDec-POMDPs, where agents are allowed to perform asynchronous learning and decision-making with macro-action-value functions in three paradigms: decentralized learning…
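The abstract refers to macro-action-value functions without giving an update rule in the visible text. One standard way to learn values for actions of varying duration is an SMDP-style Q-learning update, sketched below for a single agent with a tabular value function. This is an illustrative assumption, not the algorithm from the thesis, which uses deep networks. The names `discounted_return` and `macro_q_update`, and the use of a hashable `history` key in place of an observation history, are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, Sequence, Tuple


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Discounted sum of the rewards collected while one macro-action ran."""
    return sum((gamma ** k) * r for k, r in enumerate(rewards))


def macro_q_update(
    q: Dict[Tuple[Hashable, str], float],
    history: Hashable,
    macro: str,
    rewards: Sequence[float],
    next_history: Hashable,
    macros: Sequence[str],
    alpha: float = 0.1,
    gamma: float = 0.9,
) -> float:
    """Apply one SMDP-style Q-learning update and return the new value.

    The bootstrap term is discounted by gamma ** tau, where tau is the number
    of primitive steps the macro-action lasted (assumed equal to len(rewards)).
    """
    tau = len(rewards)
    best_next = max(q[(next_history, m)] for m in macros)
    target = discounted_return(rewards, gamma) + (gamma ** tau) * best_next
    q[(history, macro)] += alpha * (target - q[(history, macro)])
    return q[(history, macro)]


if __name__ == "__main__":
    q_table = defaultdict(float)  # unseen (history, macro) pairs start at 0.0
    value = macro_q_update(
        q_table, "h0", "go_to_shelf", [1.0, 1.0], "h1",
        macros=["go_to_shelf", "deliver_tool"], alpha=0.5, gamma=0.5,
    )
    print(value)
```

The update is applied only when a macro-action terminates, which is why the reward is accumulated over its whole duration rather than taken from a single step.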

Taxonomy

Topics: Reinforcement Learning in Robotics