Value-Based Deep Multi-Agent Reinforcement Learning with Dynamic Sparse Training
Pihe Hu, Shaolong Li, Zhuoran Li, Ling Pan, Longbo Huang

TL;DR
This paper introduces MAST, a novel sparse training framework for deep multi-agent reinforcement learning that significantly reduces computational costs while maintaining high performance.
Contribution
It develops the MAST framework with a Soft Mellowmax Operator, dual replay buffers, and gradient-based topology evolution to improve sparse MARL training reliability and efficiency.
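For readers unfamiliar with mellowmax-style operators, the following is a minimal sketch, not the authors' implementation. It assumes the standard mellowmax definition, mm_ω(q) = (1/ω) log((1/n) Σ exp(ω q_i)), and one common softmax-weighted "soft" variant; the exact form used in MAST is not given in this summary. The function names and the parameters `omega` and `alpha` are illustrative assumptions.

```python
import numpy as np

def mellowmax(q, omega):
    """Standard mellowmax: a smooth value between mean(q) and max(q)."""
    q = np.asarray(q, dtype=float)
    m = q.max()  # subtract the max before exponentiating for numerical stability
    return m + np.log(np.mean(np.exp(omega * (q - m)))) / omega

def soft_mellowmax(q, omega, alpha):
    """Assumed soft variant: the uniform average is replaced by softmax(alpha * q) weights."""
    q = np.asarray(q, dtype=float)
    m = q.max()
    w = np.exp(alpha * (q - m))
    w /= w.sum()  # softmax weights, sum to 1
    return m + np.log(np.sum(w * np.exp(omega * (q - m)))) / omega

q_values = [1.0, 2.0, 3.0]
mm = mellowmax(q_values, omega=5.0)
sm = soft_mellowmax(q_values, omega=5.0, alpha=1.0)
```

With `alpha=0` the softmax weights are uniform and the soft variant reduces to standard mellowmax. Both stay at or below the hard max, which is why operators of this kind are used to temper overestimation in value targets.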
Findings
Achieves up to a 20x reduction in FLOPs for both training and inference.
Keeps performance degradation below 3%.
Demonstrates effectiveness across various algorithms and benchmarks.
Abstract
Deep Multi-agent Reinforcement Learning (MARL) relies on neural networks with numerous parameters in multi-agent scenarios, often incurring substantial computational overhead. Consequently, there is an urgent need to expedite training and enable model compression in MARL. This paper proposes the utilization of dynamic sparse training (DST), a technique proven effective in deep supervised learning tasks, to alleviate the computational burdens in MARL training. However, a direct adoption of DST fails to yield satisfactory MARL agents, leading to breakdowns in value learning within deep sparse value-based MARL models. Motivated by this challenge, we introduce an innovative Multi-Agent Sparse Training (MAST) framework aimed at simultaneously enhancing the reliability of learning targets and the rationality of sample distribution to improve value learning in sparse models. Specifically, MAST…
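To make the idea of dynamic sparse training concrete, here is a minimal sketch of one generic drop-and-grow topology update on a single weight matrix. It assumes a common DST recipe: prune the smallest-magnitude active weights, then regrow the same number of inactive connections where the gradient magnitude is largest. It is not the MAST procedure itself; the function name `evolve_topology` and the parameter `drop_fraction` are hypothetical.

```python
import numpy as np

def evolve_topology(weights, grads, mask, drop_fraction=0.3):
    """One drop-and-grow step that keeps the number of active weights constant."""
    mask = np.asarray(mask, dtype=bool)
    flat_w = np.asarray(weights, dtype=float).ravel().copy()
    flat_g = np.asarray(grads, dtype=float).ravel()
    flat_m = mask.ravel().copy()

    active = np.flatnonzero(flat_m)
    inactive = np.flatnonzero(~flat_m)
    # Number of connections to swap, capped by how many inactive slots exist.
    k = min(int(drop_fraction * active.size), inactive.size)
    if k > 0:
        # Drop: the k active weights with the smallest magnitude.
        drop = active[np.argsort(np.abs(flat_w[active]))[:k]]
        # Grow: the k previously inactive connections with the largest gradient magnitude.
        grow = inactive[np.argsort(-np.abs(flat_g[inactive]))[:k]]
        flat_m[drop] = False
        flat_m[grow] = True
        flat_w[drop] = 0.0
        flat_w[grow] = 0.0  # newly grown connections start at zero

    flat_w[~flat_m] = 0.0  # masked-out weights are always zero
    return flat_w.reshape(mask.shape), flat_m.reshape(mask.shape)

rng = np.random.default_rng(0)
mask0 = np.zeros(20, dtype=bool)
mask0[:10] = True
mask0 = mask0.reshape(4, 5)
w0 = rng.normal(size=(4, 5)) * mask0
g0 = rng.normal(size=(4, 5))
w1, mask1 = evolve_topology(w0, g0, mask0, drop_fraction=0.3)
```

Because the drop and grow counts match, the layer's sparsity level stays fixed across updates, so the FLOPs budget is unchanged while the connectivity pattern adapts during training.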
Taxonomy
Topics: Reinforcement Learning in Robotics · Data Stream Mining Techniques · Advanced Technologies in Various Fields
Methods: Dynamic Sparse Training
