SIDE: State Inference for Partially Observable Cooperative Multi-Agent Reinforcement Learning

Zhiwei Xu; Yunpeng Bai; Dapeng Li; Bin Zhang; Guoliang Fan

arXiv:2105.06228·cs.MA·December 21, 2021·1 cites

TL;DR

SIDE introduces a novel value decomposition framework that infers hidden states from local observations, enabling effective multi-agent reinforcement learning in partially observable environments without requiring full state information.

Contribution

The paper proposes SIDE, a new framework that combines state inference with value decomposition, allowing multi-agent RL in partially observable settings without access to global states.

Findings

1. SIDE can accurately infer current states from local observations.
2. It outperforms several baselines in complex StarCraft II tasks.
3. The method is extendable to various value decomposition algorithms.

Abstract

As one of the solutions to decentralized partially observable Markov decision process (Dec-POMDP) problems, value decomposition has achieved significant results recently. However, most value decomposition methods require the fully observable state of the environment during training, which is not feasible in scenarios where only incomplete and noisy observations can be obtained. Therefore, we propose a novel value decomposition framework, named State Inference for value DEcomposition (SIDE), which eliminates the need to know the global state by simultaneously seeking solutions to the two problems of optimal control and state inference. SIDE can be extended to any value decomposition method to tackle partially observable problems. By comparing with the performance of different algorithms in StarCraft II micromanagement tasks, we verified that though without…
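To make the idea in the abstract concrete, here is a minimal sketch of value decomposition in which the mixing step is conditioned on a state inferred from local observations rather than on the true global state. It is an illustration under stated assumptions, not the paper's architecture, which this listing does not describe: the linear utilities, the linear `infer_state` encoder, all dimensions, and the QMIX-style monotonic mixer are hypothetical choices made only for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
N_AGENTS, OBS_DIM, N_ACTIONS, STATE_DIM = 3, 4, 5, 6

# Assumed linear per-agent utility: Q_i(o_i, .) = o_i @ W_q (weights shared across agents).
W_q = rng.normal(size=(OBS_DIM, N_ACTIONS))
# Assumed linear encoder standing in for state inference from local observations.
W_enc = rng.normal(size=(N_AGENTS * OBS_DIM, STATE_DIM))
# Hypernetwork-style parameters mapping the inferred state to mixing weights and a bias.
W_mix = rng.normal(size=(STATE_DIM, N_AGENTS))
w_bias = rng.normal(size=STATE_DIM)


def infer_state(obs):
    """Map stacked local observations (N_AGENTS, OBS_DIM) to an inferred state vector."""
    return np.tanh(obs.reshape(-1) @ W_enc)


def agent_q_values(obs):
    """Per-agent action values, shape (N_AGENTS, N_ACTIONS)."""
    return obs @ W_q


def mix(chosen_q, state_hat):
    """QMIX-style monotonic mixing: non-negative weights keep dQ_tot/dQ_i >= 0."""
    weights = np.abs(state_hat @ W_mix)  # shape (N_AGENTS,)
    bias = state_hat @ w_bias            # scalar
    return float(weights @ chosen_q + bias)


obs = rng.normal(size=(N_AGENTS, OBS_DIM))
q = agent_q_values(obs)
greedy_actions = q.argmax(axis=1)                  # each agent acts on its own utility
chosen_q = q[np.arange(N_AGENTS), greedy_actions]
state_hat = infer_state(obs)                       # replaces the unavailable global state
q_tot = mix(chosen_q, state_hat)
```

Because the mixing weights are non-negative, each agent's greedy action also maximizes the joint value `q_tot`, which is what allows decentralized execution. In this sketch only the source of the mixer's conditioning input changes: the inferred `state_hat` takes the place of a global state.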

Taxonomy

Topics: Reinforcement Learning in Robotics