Agent-Temporal Credit Assignment for Optimal Policy Preservation in Sparse Multi-Agent Reinforcement Learning
Aditya Kapoor, Sushant Swamy, Kale-ab Tessera, Mayank Baranwal, Mingfei Sun, Harshad Khadilkar, Stefano V. Albrecht

TL;DR
This paper introduces TAR$^2$, a reward redistribution method that improves learning stability and speed in multi-agent reinforcement learning with sparse rewards by decomposing global rewards into agent-specific, time-step-specific signals.
Contribution
The paper proposes TAR$^2$, a novel reward redistribution technique that addresses agent-temporal credit assignment while preserving optimal policies, supported by theoretical proof and empirical validation.
Findings
TAR$^2$ stabilizes and accelerates learning.
When combined with single-agent algorithms, TAR$^2$ matches or outperforms traditional multi-agent methods.
TAR$^2$ is equivalent to potential-based reward shaping.
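For context, potential-based reward shaping in its standard single-agent form adds a shaping term built from a potential function over states. The symbols below ($\Phi$, $\gamma$, $F$, $\tilde{r}$) follow the usual textbook convention and are assumptions here, not notation taken from the paper, whose own potential function is not given in this summary:

$$
F(s, s') = \gamma \, \Phi(s') - \Phi(s), \qquad \tilde{r}(s, a, s') = r(s, a, s') + F(s, s')
$$

Along a trajectory the discounted shaping terms telescope:

$$
\sum_{t=0}^{T-1} \gamma^{t} F(s_t, s_{t+1}) = \gamma^{T} \Phi(s_T) - \Phi(s_0)
$$

With $\Phi(s_T) = 0$ at terminal states, the shaped and original returns differ only by $\Phi(s_0)$, which does not depend on the actions taken. This is why shaping of this form leaves the optimal policy unchanged.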
Abstract
In multi-agent environments, agents often struggle to learn optimal policies due to sparse or delayed global rewards, particularly in long-horizon tasks where it is challenging to evaluate actions at intermediate time steps. We introduce Temporal-Agent Reward Redistribution (TAR$^2$), a novel approach designed to address the agent-temporal credit assignment problem by redistributing sparse rewards both temporally and across agents. TAR$^2$ decomposes sparse global rewards into time-step-specific rewards and calculates agent-specific contributions to these rewards. We theoretically prove that TAR$^2$ is equivalent to potential-based reward shaping, ensuring that the optimal policy remains unchanged. Empirical results demonstrate that TAR$^2$ stabilizes and accelerates the learning process. Additionally, we show that when TAR$^2$ is integrated with single-agent reinforcement learning…
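The two-stage decomposition described in the abstract (first across time steps, then across agents) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the softmax normalization, and the hand-written scores are all hypothetical stand-ins for whatever learned model produces the weights. The only property it demonstrates is that the redistributed rewards sum back to the original episodic reward.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def redistribute(episode_reward, temporal_scores, agent_scores):
    """Split one episodic reward into per-step, per-agent rewards.

    temporal_scores: one hypothetical score per time step.
    agent_scores: for each time step, one hypothetical score per agent.
    Returns rewards[t][i]; all entries together sum to episode_reward.
    """
    # Stage 1 (assumed): normalized weights over time steps.
    temporal_weights = softmax(temporal_scores)
    rewards = []
    for w_t, scores_t in zip(temporal_weights, agent_scores):
        step_reward = w_t * episode_reward
        # Stage 2 (assumed): normalized weights over agents at this step.
        rewards.append([w_i * step_reward for w_i in softmax(scores_t)])
    return rewards


# Three time steps, two agents, a single sparse reward of 10 at episode end.
rewards = redistribute(
    10.0,
    temporal_scores=[0.0, 1.0, 2.0],
    agent_scores=[[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]],
)
total = sum(sum(step) for step in rewards)
```

Because both sets of weights are normalized, `total` equals the episodic reward up to floating-point error, so each agent receives a dense signal at every step without the overall return being changed.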
Taxonomy
Topics: Traffic control and management
