Intrinsic Action Tendency Consistency for Cooperative Multi-Agent Reinforcement Learning

Junkai Zhang; Yifan Zhang; Xi Sheryl Zhang; Yifan Zang; Jian Cheng

arXiv:2406.18152·cs.MA·August 30, 2024

Open Access

TL;DR

This paper introduces a new intrinsic reward mechanism based on action tendency prediction to improve cooperation and training efficiency in multi-agent reinforcement learning within the CTDE framework.

Contribution

It proposes Intrinsic Action Tendency Consistency, integrating intrinsic rewards via an action model to enhance policy consensus among agents in cooperative RL.

Findings

01. Improved performance on the SMAC and GRF benchmarks.

02. Theoretical proof that RA-CTDE is equivalent to CTDE.

03. Improved training efficiency: agents need fewer training samples.

Abstract

Efficient collaboration in the centralized training with decentralized execution (CTDE) paradigm remains a challenge in cooperative multi-agent systems. We identify divergent action tendencies among agents as a significant obstacle to CTDE's training efficiency, requiring a large number of training samples to achieve a unified consensus on agents' policies. This divergence stems from the lack of adequate team consensus-related guidance signals during credit assignments in CTDE. To address this, we propose Intrinsic Action Tendency Consistency, a novel approach for cooperative multi-agent reinforcement learning. It integrates intrinsic rewards, obtained through an action model, into a reward-additive CTDE (RA-CTDE) framework. We formulate an action model that enables surrounding agents to predict the central agent's action tendency. Leveraging these predictions, we compute a cooperative…
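The abstract is cut off before it defines the cooperative reward, so the following is only a minimal sketch of the general idea, not the authors' formulation. It assumes (hypothetically) that an agent's action tendency is a softmax over its action values, that each surrounding agent's action model outputs a predicted distribution over the central agent's actions, that consistency is scored as a negative KL divergence, and that the intrinsic term is added to the team reward with a weight `beta`. All function names and the choice of KL divergence are illustrative assumptions.

```python
import math


def softmax(values):
    """Turn raw action values into a probability distribution (the 'tendency')."""
    m = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def intrinsic_consistency_reward(central_action_values, neighbor_predictions):
    """Hypothetical intrinsic reward for the central agent.

    Assumption: the reward is the negative mean KL divergence between the
    central agent's actual tendency and each neighbor's predicted tendency,
    so it is 0 when every neighbor predicts perfectly and negative otherwise.
    """
    actual = softmax(central_action_values)
    divergences = [kl_divergence(actual, pred) for pred in neighbor_predictions]
    return -sum(divergences) / len(divergences)


def reward_additive(team_reward, intrinsic_reward, beta=0.1):
    """Assumed reward-additive combination: team reward plus weighted intrinsic term."""
    return team_reward + beta * intrinsic_reward


if __name__ == "__main__":
    central_values = [2.0, 0.0, 0.0]             # central agent prefers action 0
    aligned = [softmax(central_values)]          # neighbor predicts it exactly
    divergent = [[0.1, 0.1, 0.8]]                # neighbor expects action 2

    for name, preds in (("aligned", aligned), ("divergent", divergent)):
        r_int = intrinsic_consistency_reward(central_values, preds)
        print(name, r_int, reward_additive(1.0, r_int))
```

In this sketch, a neighbor whose prediction matches the central agent's tendency leaves the team reward unchanged, while a mismatched prediction lowers it. That is one simple way a consensus-related signal could enter training.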

Taxonomy

Topics: Reinforcement Learning in Robotics