A Historical Interaction-Enhanced Shapley Policy Gradient Algorithm for Multi-Agent Credit Assignment
Ao Ding, Licheng Sun, Yongjie Hou, Huaqing Zhang, Hongbin Ma

TL;DR
This paper introduces HIS, a novel multi-agent reinforcement learning algorithm that uses historical interaction data and Shapley values to improve credit assignment, stability, and performance in complex collaborative tasks.
Contribution
HIS is the first to integrate historical interaction-enhanced Shapley value calculation with a hybrid credit mechanism for stable, efficient multi-agent credit assignment.
Findings
HIS outperforms existing methods in complex benchmark environments.
The hybrid credit mechanism improves training stability and generalization.
Theoretical guarantees ensure efficiency and stability of credit assignment.
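For reference, the standard game-theoretic Shapley value and its efficiency property are given below. This is the textbook definition, not a formula taken from the paper; the symbols $N$ (agent set), $n = |N|$, and $v$ (coalition value function) are notation assumed here, and the "efficiency" named in the finding above may or may not refer to exactly this axiom.

```latex
\phi_i(v) \;=\; \sum_{S \subseteq N \setminus \{i\}}
  \frac{|S|!\,(n - |S| - 1)!}{n!}\,
  \bigl( v(S \cup \{i\}) - v(S) \bigr),
\qquad
\sum_{i \in N} \phi_i(v) \;=\; v(N) - v(\emptyset).
```

When $v(\emptyset) = 0$, the second identity says the individual credits sum exactly to the value of the full team.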
Abstract
Multi-agent reinforcement learning (MARL) has demonstrated remarkable performance in multi-agent collaboration problems and has become a prominent topic in artificial intelligence research in recent years. However, traditional credit assignment schemes in MARL cannot reliably capture individual contributions in strongly coupled tasks while maintaining training stability, which leads to limited generalization capabilities and hinders algorithm performance. To address these challenges, we propose a Historical Interaction-Enhanced Shapley Policy Gradient Algorithm (HIS) for Multi-Agent Credit Assignment, which employs a hybrid credit assignment mechanism to balance base rewards with individual contribution incentives. By utilizing historical interaction data to calculate the Shapley value in a sample-efficient manner, HIS enhances the agent's ability to perceive its own contribution, while…
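The two ideas named in the abstract, Shapley-based contribution estimates and a hybrid credit that mixes a base reward with a contribution incentive, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names `shapley_values` and `hybrid_credit`, the mixing weight `alpha`, the equal-split base reward, and the toy coalition value function are hypothetical and are not taken from the paper. HIS estimates Shapley values from historical interaction data; this sketch instead computes them exactly by enumerating all agent orderings, which is only feasible for a handful of agents.

```python
from itertools import permutations
from typing import Callable, Dict, FrozenSet, Sequence


def shapley_values(
    agents: Sequence[int],
    value: Callable[[FrozenSet[int]], float],
) -> Dict[int, float]:
    """Exact Shapley values: average each agent's marginal contribution
    over every possible joining order (factorial cost, toy sizes only)."""
    totals = {a: 0.0 for a in agents}
    orderings = list(permutations(agents))
    for order in orderings:
        coalition: FrozenSet[int] = frozenset()
        for a in order:
            with_a = coalition | {a}
            totals[a] += value(with_a) - value(coalition)
            coalition = with_a
    return {a: t / len(orderings) for a, t in totals.items()}


def hybrid_credit(
    team_reward: float,
    phi: Dict[int, float],
    alpha: float = 0.5,
) -> Dict[int, float]:
    """Hypothetical hybrid credit: blend an equal-split base reward with
    the Shapley contribution. The blend form and alpha are assumptions."""
    base = team_reward / len(phi)
    return {a: (1.0 - alpha) * base + alpha * p for a, p in phi.items()}


def toy_value(coalition: FrozenSet[int]) -> float:
    """Toy coalition value: 1 per member, plus a bonus of 3 when
    agents 0 and 1 cooperate (a strongly coupled pair)."""
    bonus = 3.0 if {0, 1} <= coalition else 0.0
    return float(len(coalition)) + bonus


agents = [0, 1, 2]
phi = shapley_values(agents, toy_value)
credits = hybrid_credit(toy_value(frozenset(agents)), phi, alpha=0.5)
print(phi)      # agents 0 and 1 share the cooperation bonus equally
print(credits)  # credits are pulled toward the equal split
```

In this toy game the coupled pair receives a larger Shapley value than the third agent, and the hybrid credit shrinks that gap toward an equal split while still summing to the team reward. A practical method would replace the exhaustive enumeration with a sampled or learned estimate.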
Taxonomy
Topics: Reinforcement Learning in Robotics · Mobile Crowdsensing and Crowdsourcing · Advanced Graph Neural Networks
