AccMER: Accelerating Multi-Agent Experience Replay with Cache Locality-aware Prioritization

Kailash Gogineni, Yongsheng Mei, Peng Wei, Tian Lan, Guru Venkataramani

arXiv:2306.00187 · cs.MA · June 2, 2023 · 2 cites

Open Access

TL;DR

AccMER speeds up multi-agent reinforcement learning by reusing high-priority experiences across multiple training steps to improve cache locality, cutting training time by 25.4% on the Predator-Prey environment while keeping mean reward comparable to existing algorithms.

Contribution

The paper introduces AccMER, a novel method that reuses high-priority experiences over multiple steps to improve cache efficiency in multi-agent experience replay.
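
The reuse idea can be illustrated with a small sketch. This is not the authors' implementation: the class name `ReusingPrioritizedBuffer`, the `reuse_steps` parameter, and the priority-proportional sampling rule are all assumptions made for illustration. It only shows the general pattern of drawing a priority-weighted batch once and handing back the same transitions for several consecutive training steps, so the same data stays hot in cache.

```python
import random


class ReusingPrioritizedBuffer:
    """Toy replay buffer (hypothetical): a priority-weighted batch is
    sampled once and then reused for `reuse_steps` consecutive calls."""

    def __init__(self, capacity, reuse_steps, seed=0):
        self.capacity = capacity
        self.reuse_steps = reuse_steps  # assumed knob, not from the paper
        self.transitions = []
        self.priorities = []
        self.resamples = 0              # how many times a fresh batch was drawn
        self._next = 0                  # next slot to write (ring buffer)
        self._cached_indices = []
        self._steps_left = 0
        self._rng = random.Random(seed)

    def add(self, transition, priority=1.0):
        """Store a transition, overwriting the oldest one when full."""
        if len(self.transitions) < self.capacity:
            self.transitions.append(transition)
            self.priorities.append(priority)
        else:
            self.transitions[self._next] = transition
            self.priorities[self._next] = priority
        self._next = (self._next + 1) % self.capacity

    def sample(self, batch_size):
        """Return a batch; draw new indices only when the reuse window ends."""
        if self._steps_left == 0:
            # Priority-proportional sampling with replacement.
            self._cached_indices = self._rng.choices(
                range(len(self.transitions)),
                weights=self.priorities,
                k=batch_size,
            )
            self._steps_left = self.reuse_steps
            self.resamples += 1
        self._steps_left -= 1
        return [self.transitions[i] for i in self._cached_indices]


if __name__ == "__main__":
    buf = ReusingPrioritizedBuffer(capacity=16, reuse_steps=2)
    for i in range(16):
        # Placeholder transitions; higher index gets higher priority.
        buf.add(("obs", i), priority=float(i + 1))
    for step in range(4):
        print(step, buf.sample(batch_size=4))
    print("fresh batches drawn:", buf.resamples)
```

In this sketch a larger `reuse_steps` means fewer fresh draws from the full buffer, which is the trade-off the contribution describes: better data locality in exchange for training on the same high-priority transitions for longer.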

Findings

01 Achieves a 25.4% reduction in training time on the Predator-Prey environment.

02 Maintains mean reward comparable to existing algorithms.

03 Improves cache locality and reduces data movement in multi-agent RL.

Abstract

Multi-Agent Experience Replay (MER) is a key component of off-policy reinforcement learning (RL) algorithms. By remembering and reusing experiences from the past, experience replay significantly improves the stability of RL algorithms and their learning efficiency. In many scenarios, multiple agents interact in a shared environment during online training under the centralized training and decentralized execution (CTDE) paradigm. Current multi-agent reinforcement learning (MARL) algorithms consider experience replay with uniform sampling or based on priority weights to improve transition data sample efficiency in the sampling phase. However, moving transition data histories for each agent through the processor memory hierarchy is a performance limiter. Also, as the agents' transitions continuously renew every iteration, the finite cache capacity results in increased cache misses. To this…

Taxonomy

Topics: Smart Grid Energy Management · Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research