TL;DR
The paper introduces MAICC, a decentralized memory retrieval method that improves coordination and adaptation speed in multi-agent reinforcement learning tasks by leveraging trajectory embeddings and a hybrid utility score.
Contribution
It proposes a novel decentralized memory mechanism and embedding-based retrieval approach to enhance coordination and fast adaptation in cooperative MARL.
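As an illustration of embedding-based retrieval with a hybrid utility score, the following is a minimal sketch and not the authors' implementation. This summary does not define the score, so the sketch assumes it blends cosine similarity between trajectory embeddings with a min-max normalized team return, weighted by a hypothetical parameter `alpha`. The names `MemoryEntry`, `retrieve`, and `alpha` are invented for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class MemoryEntry:
    # Hypothetical record: one stored trajectory in an agent's local memory.
    trajectory_id: str
    embedding: list      # assumed output of a trajectory embedding model
    team_return: float   # cumulative team reward of the stored trajectory


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_embedding, memory, k=2, alpha=0.5):
    """Return the top-k entries by an assumed hybrid score:
    alpha * similarity + (1 - alpha) * normalized team return."""
    if not memory:
        return []
    returns = [entry.team_return for entry in memory]
    low, span = min(returns), max(returns) - min(returns)
    scored = []
    for entry in memory:
        similarity = cosine_similarity(query_embedding, entry.embedding)
        # Min-max normalize returns so both terms lie on a comparable scale.
        utility = (entry.team_return - low) / span if span > 0 else 0.0
        scored.append((alpha * similarity + (1 - alpha) * utility, entry))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [entry for _, entry in scored[:k]]


memory = [
    MemoryEntry("a", [1.0, 0.0], 0.0),
    MemoryEntry("b", [0.0, 1.0], 10.0),
    MemoryEntry("c", [1.0, 0.1], 5.0),
]
top = retrieve([1.0, 0.0], memory, k=1, alpha=0.5)
print(top[0].trajectory_id)
```

In this sketch, setting `alpha` to 1.0 ranks purely by similarity to the current context, and 0.0 ranks purely by past team return. In a decentralized setting each agent would hold its own `memory` and call `retrieve` locally.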
Findings
MAICC achieves faster adaptation on the Level-Based Foraging (LBF) and StarCraft Multi-Agent Challenge (SMAC) benchmarks.
The method improves coordination by using team-level trajectory retrieval.
Code is publicly available at the provided GitHub URL.
Abstract
Large transformer models, trained on diverse datasets, have demonstrated impressive few-shot performance on previously unseen tasks without requiring parameter updates. This capability has also been explored in Reinforcement Learning (RL), where agents interact with the environment to retrieve context and maximize cumulative rewards, showcasing strong adaptability in complex settings. However, in cooperative Multi-Agent Reinforcement Learning (MARL), where agents must coordinate toward a shared goal, decentralized policy deployment can lead to mismatches in task alignment and reward assignment, limiting the efficiency of policy adaptation. To address this challenge, we introduce Multi-agent In-context Coordination via Decentralized Memory Retrieval (MAICC), a novel approach designed to enhance coordination through fast adaptation. Our method involves training a centralized embedding model to…
