SIDE: State Inference for Partially Observable Cooperative Multi-Agent Reinforcement Learning
Zhiwei Xu, Yunpeng Bai, Dapeng Li, Bin Zhang, Guoliang Fan

TL;DR
SIDE introduces a novel value decomposition framework that infers hidden states from local observations, enabling effective multi-agent reinforcement learning in partially observable environments without requiring full state information.
Contribution
The paper proposes SIDE, a new framework that combines state inference with value decomposition, allowing multi-agent RL in partially observable settings without access to global states.
Findings
SIDE can accurately infer current states from local observations.
It outperforms several baselines in complex StarCraft II tasks.
The method can be extended to other value decomposition algorithms.
Abstract
Value decomposition is one approach to decentralized partially observable Markov decision process (Dec-POMDP) problems, and it has achieved significant results recently. However, most value decomposition methods require the fully observable state of the environment during training, which is not feasible in scenarios where only incomplete and noisy observations can be obtained. Therefore, we propose a novel value decomposition framework, named State Inference for value DEcomposition (SIDE), which eliminates the need to know the global state by simultaneously seeking solutions to the two problems of optimal control and state inference. SIDE can be extended to any value decomposition method to tackle partially observable problems. By comparing the performance of different algorithms in StarCraft II micromanagement tasks, we verified that though without…
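The idea of mixing per-agent values using a state estimate built only from local observations can be illustrated with a small sketch. This is not the paper's architecture: the abstract does not say how SIDE infers the state or mixes values, so `infer_state` (a plain feature-wise average) and `mixing_weights` (a fixed positive function) are hypothetical stand-ins for learned networks, and all sizes are arbitrary. The sketch only shows a general property of monotonic value decomposition: when the mixing weights are positive, each agent's greedy action also maximizes the joint value.

```python
import itertools
import math
import random


def infer_state(observations):
    # Hypothetical stand-in for a learned inference model:
    # average the agents' local observation vectors feature by feature.
    n = len(observations)
    return [sum(column) / n for column in zip(*observations)]


def mixing_weights(state, n_agents):
    # Hypothetical stand-in for a state-conditioned mixer.
    # Strictly positive weights keep the joint value monotonic in each agent's value.
    return [abs(math.sin(sum(state) + i)) + 0.1 for i in range(n_agents)]


def q_total(agent_qs, weights):
    # Weighted sum of the chosen per-agent values.
    return sum(w * q for w, q in zip(weights, agent_qs))


rng = random.Random(0)
n_agents, n_actions, obs_dim = 3, 4, 5  # arbitrary illustrative sizes

# Random local observations and per-agent action values (placeholders for learned quantities).
observations = [[rng.uniform(-1, 1) for _ in range(obs_dim)] for _ in range(n_agents)]
q_tables = [[rng.uniform(-1, 1) for _ in range(n_actions)] for _ in range(n_agents)]

# The mixer sees only the inferred state, never a true global state.
state = infer_state(observations)
weights = mixing_weights(state, n_agents)

# Decentralized execution: every agent picks its own best action.
greedy = tuple(q.index(max(q)) for q in q_tables)


def joint_value(actions):
    return q_total([q_tables[i][a] for i, a in enumerate(actions)], weights)


# Centralized check: brute-force search over all joint actions.
best_joint = max(itertools.product(range(n_actions), repeat=n_agents), key=joint_value)

print("greedy per-agent actions:", greedy)
print("best joint action:       ", best_joint)
```

Because the weights depend only on the inferred state, nothing in this sketch needs access to the environment's global state. A real implementation would train the inference model and the mixer jointly rather than fix them by hand.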
Taxonomy
Topics: Reinforcement Learning in Robotics
