On the Fundamental Limitations of Decentralized Learnable Reward Shaping in Cooperative Multi-Agent Reinforcement Learning
Aditya Akella

TL;DR
This paper investigates the limitations of decentralized learnable reward shaping in cooperative multi-agent reinforcement learning, revealing fundamental barriers that prevent decentralized methods from matching centralized approaches in complex tasks.
Contribution
It introduces DMARL-RSA, a decentralized reward shaping system, and empirically demonstrates its limitations compared to centralized training, highlighting key challenges in decentralized multi-agent coordination.
Findings
Decentralized reward shaping (DMARL-RSA, -24.20 +/- 0.09) trails centralized MAPPO (1.92 +/- 0.87) by 26.12 points in average reward.
Decentralized methods achieve higher landmark coverage but worse overall task performance.
Three barriers identified: non-stationarity, credit assignment complexity, and reward-objective misalignment.
Abstract
Recent advances in learnable reward shaping have shown promise in single-agent reinforcement learning by automatically discovering effective feedback signals. However, the effectiveness of decentralized learnable reward shaping in cooperative multi-agent settings remains poorly understood. We propose DMARL-RSA, a fully decentralized system where each agent learns individual reward shaping, and evaluate it on cooperative navigation tasks in the simple_spread_v3 environment. Despite sophisticated reward learning, DMARL-RSA achieves only -24.20 +/- 0.09 average reward, compared to MAPPO with centralized training at 1.92 +/- 0.87 -- a 26.12-point gap. DMARL-RSA performs similarly to simple independent learning (IPPO: -23.19 +/- 0.96), indicating that advanced reward shaping cannot overcome fundamental decentralized coordination limitations. Interestingly, decentralized methods achieve…
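The gaps quoted in the abstract can be rechecked from the reported means. The short sketch below only recomputes differences between the published numbers; the dictionary layout and the helper name `gap` are illustrative choices, not part of the paper's code.

```python
# Reported average rewards as (mean, std), copied from the abstract.
reported = {
    "DMARL-RSA": (-24.20, 0.09),  # decentralized learnable reward shaping
    "IPPO": (-23.19, 0.96),       # independent learning baseline
    "MAPPO": (1.92, 0.87),        # centralized training
}


def gap(a, b):
    """Difference in mean reward (a minus b), rounded to two decimals."""
    return round(reported[a][0] - reported[b][0], 2)


# Gap between centralized training and decentralized reward shaping.
centralized_gap = gap("MAPPO", "DMARL-RSA")

# Gap between decentralized reward shaping and plain independent learning.
shaping_vs_independent = gap("DMARL-RSA", "IPPO")

print(f"MAPPO - DMARL-RSA: {centralized_gap:.2f}")
print(f"DMARL-RSA - IPPO:  {shaping_vs_independent:.2f}")
```

The first difference reproduces the 26.12-point gap. The second is about one point, close to IPPO's reported spread of 0.96, which is consistent with the abstract's statement that DMARL-RSA performs similarly to independent learning.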
Taxonomy
Topics: Reinforcement Learning in Robotics · Explainable Artificial Intelligence (XAI) · Domain Adaptation and Few-Shot Learning
