Deep Decentralized Multi-task Multi-Agent Reinforcement Learning under Partial Observability

Shayegan Omidshafiei; Jason Pazis; Christopher Amato; Jonathan P. How; John Vian

arXiv:1703.06182 · cs.LG · May 23, 2018 · 188 cites

TL;DR

This paper addresses the challenge of multi-task multi-agent reinforcement learning under partial observability by proposing a decentralized approach and a policy distillation method that handles multiple tasks without explicit task identities.

Contribution

The paper introduces a decentralized single-task learning framework and a policy distillation technique for multi-task multi-agent RL under partial observability, overcoming limitations of task-specific policies.

Findings

01. Decentralized learning is robust to non-stationary multi-agent interactions.

02. Policy distillation enables multi-task performance without explicit task identities.

03. The approach improves scalability and applicability in real-world multi-agent systems.

Abstract

Many real-world tasks involve multiple agents with partial observability and limited communication. Learning is challenging in these settings due to local viewpoints of agents, which perceive the world as non-stationary due to concurrently-exploring teammates. Approaches that learn specialized policies for individual tasks face problems when applied to the real world: not only do agents have to learn and store distinct policies for each task, but in practice identities of tasks are often non-observable, making these approaches inapplicable. This paper formalizes and addresses the problem of multi-task multi-agent reinforcement learning under partial observability. We introduce a decentralized single-task learning approach that is robust to concurrent interactions of teammates, and present an approach for distilling single-task policies into a unified policy that performs well across…
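
As a rough illustration of the distillation idea mentioned in the abstract, the sketch below computes a generic policy distillation loss: the KL divergence between a temperature-softened distribution over a single-task teacher's action values and the distribution produced by a multi-task student. This is the commonly used formulation of policy distillation, not necessarily the exact loss used in the paper. The function names, the temperature value, and the toy action values are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, temperature=1.0):
    """Numerically stable softmax over the last axis."""
    z = np.asarray(x, dtype=float) / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(teacher_q, student_q, temperature=0.1):
    """Mean KL(teacher || student) over a batch of action-value vectors.

    teacher_q, student_q: arrays of shape (batch, num_actions).
    temperature: hypothetical softening parameter applied to the teacher only.
    """
    p = softmax(teacher_q, temperature)  # softened teacher targets
    q = softmax(student_q)               # student action distribution
    eps = 1e-12                          # guards against log(0)
    kl = np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)
    return float(kl.mean())

# Toy action values for two observations and three actions (made up).
teacher_q = np.array([[1.0, 2.0, 0.5], [0.2, 0.1, 0.9]])

# A student that reproduces the teacher incurs zero loss at temperature 1.
matching = distillation_loss(teacher_q, teacher_q, temperature=1.0)

# A student with reversed preferences incurs a strictly positive loss.
mismatched = distillation_loss(teacher_q, -teacher_q, temperature=1.0)
```

In a multi-task setting, a loss of this kind would be averaged over data drawn from every single-task teacher, so that one student is trained against all of them without receiving a task label as input.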

Taxonomy

Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control · Domain Adaptation and Few-Shot Learning