AccMER: Accelerating Multi-Agent Experience Replay with Cache Locality-aware Prioritization
Kailash Gogineni, Yongsheng Mei, Peng Wei, Tian Lan, Guru Venkataramani

TL;DR
AccMER enhances multi-agent reinforcement learning efficiency by reusing high-priority experiences to improve cache locality, reducing training time by over 25% without sacrificing reward performance.
Contribution
The paper introduces AccMER, a novel method that reuses high-priority experiences over multiple steps to improve cache efficiency in multi-agent experience replay.
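The reuse idea can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `ReusingPrioritizedSampler`, the `reuse_steps` parameter, and the choice of priority-weighted sampling with replacement are all assumptions made for illustration. The sketch only shows the control flow of selecting a priority-weighted batch once and returning the same indices for several consecutive update steps, so that the same transitions are touched repeatedly instead of fetching a fresh batch every step.

```python
import random


class ReusingPrioritizedSampler:
    """Hypothetical sketch: select a priority-weighted batch, then reuse it."""

    def __init__(self, priorities, batch_size, reuse_steps, seed=0):
        self.priorities = list(priorities)  # one priority weight per stored transition
        self.batch_size = batch_size
        self.reuse_steps = reuse_steps      # assumed knob: update steps per selected batch
        self._rng = random.Random(seed)
        self._cached = None                 # indices of the batch currently being reused
        self._uses_left = 0

    def sample(self):
        # Select a new batch only after the cached one has been used reuse_steps times.
        if self._uses_left == 0:
            indices = range(len(self.priorities))
            # Priority-weighted sampling with replacement (an assumption of this sketch).
            self._cached = self._rng.choices(
                indices, weights=self.priorities, k=self.batch_size
            )
            self._uses_left = self.reuse_steps
        self._uses_left -= 1
        return self._cached


sampler = ReusingPrioritizedSampler([0.1, 0.5, 2.0, 1.0], batch_size=2, reuse_steps=3)
first, second, third = sampler.sample(), sampler.sample(), sampler.sample()
fourth = sampler.sample()  # triggers a fresh selection
```

Here the first three calls return the same index list, and the fourth call draws a new one. In a real training loop the indices would address transitions in the replay buffer; how many steps to reuse a batch is a trade-off between locality and sample diversity that this sketch does not attempt to tune.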
Findings
Achieves a 25.4% reduction in training time on the Predator-Prey environment.
Maintains comparable mean reward performance to existing algorithms.
Effectively improves cache locality and reduces data movement in multi-agent RL.
Abstract
Multi-Agent Experience Replay (MER) is a key component of off-policy reinforcement learning (RL) algorithms. By remembering and reusing experiences from the past, experience replay significantly improves the stability of RL algorithms and their learning efficiency. In many scenarios, multiple agents interact in a shared environment during online training under the centralized training and decentralized execution (CTDE) paradigm. Current multi-agent reinforcement learning (MARL) algorithms use experience replay with uniform sampling or priority-weighted sampling to improve the sample efficiency of transition data in the sampling phase. However, moving transition data histories for each agent through the processor memory hierarchy is a performance limiter. Also, as the agents' transitions are continuously renewed every iteration, the finite cache capacity results in increased cache misses. To this…
Taxonomy
Topics: Smart Grid Energy Management · Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research
