CH-MARL: A Multimodal Benchmark for Cooperative, Heterogeneous Multi-Agent Reinforcement Learning
Vasu Sharma, Prasoon Goyal, Kaixiang Lin, Govind Thattai, Qiaozi Gao, Gaurav S. Sukhatme

TL;DR
This paper introduces a multimodal benchmark for cooperative multi-agent reinforcement learning involving vision and language modalities, providing datasets, frameworks, and evaluations to advance research in heterogeneous multi-robot collaboration.
Contribution
It presents a novel multimodal dataset, an integrated learning framework, and evaluation protocols for heterogeneous multi-agent reinforcement learning in complex environments.
Findings
Multimodality poses unique challenges for multi-agent learning.
Existing methods show room for improvement in multimodal cooperative tasks.
The benchmark facilitates systematic evaluation of multi-agent RL methods.
Abstract
We propose a multimodal (vision-and-language) benchmark for cooperative and heterogeneous multi-agent learning. We introduce a benchmark multimodal dataset with tasks involving collaboration between multiple simulated heterogeneous robots in a rich multi-room home environment. We provide an integrated learning framework, multimodal implementations of state-of-the-art multi-agent reinforcement learning techniques, and a consistent evaluation protocol. Our experiments investigate the impact of different modalities on multi-agent learning performance. We also introduce a simple message passing method between agents. The results suggest that multimodality introduces unique challenges for cooperative multi-agent learning and there is significant room for advancing multi-agent reinforcement learning methods in such settings.
Taxonomy
Topics: Reinforcement Learning in Robotics · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning
