Strangeness-driven Exploration in Multi-Agent Reinforcement Learning
Ju-Bong Kim, Ho-Bin Choi, Youn-Hee Han

TL;DR
This paper introduces a novel exploration method based on 'strangeness' that enhances cooperative multi-agent reinforcement learning by improving exploration efficiency and stability, demonstrated through experiments on StarCraft and other benchmarks.
Contribution
The paper proposes a new strangeness-based exploration technique that can be integrated into any CTDE-based MARL algorithm, improving exploration and stability.
Findings
Significant performance improvements in StarCraft Multi-Agent Challenge
Robustness to stochastic transitions in MARL tasks
Enhanced stability of training with exploration bonus
Abstract
An efficient exploration strategy is one of the essential issues in cooperative multi-agent reinforcement learning (MARL) algorithms requiring complex coordination. In this study, we introduce a new strangeness-based exploration method that can be easily incorporated into any centralized training and decentralized execution (CTDE)-based MARL algorithm. Strangeness refers to the degree of unfamiliarity of the observations that an agent visits. To give the observation strangeness a global perspective, it is also augmented with the degree of unfamiliarity of the entire visited state. The exploration bonus is obtained from the strangeness, and the proposed exploration method is not much affected by the stochastic transitions commonly observed in MARL tasks. To prevent a high exploration bonus from making the MARL training insensitive to extrinsic rewards, we also propose a separate…
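The abstract describes the bonus only in words, so the following is a minimal sketch of the idea rather than the authors' implementation. It assumes a count-based novelty score as a stand-in for "unfamiliarity" (the abstract does not say how unfamiliarity is measured), and the names `CountNovelty`, `strangeness_bonuses`, and `state_weight` are hypothetical. Each agent's bonus combines the unfamiliarity of its own observation with the unfamiliarity of the shared global state.

```python
import math
from collections import Counter


class CountNovelty:
    """Hypothetical stand-in for an unfamiliarity measure.

    Unfamiliarity decays with the number of prior visits to the same key.
    """

    def __init__(self):
        self.counts = Counter()

    def score(self, key):
        # Score is computed from prior visits, then the visit is recorded.
        prior_visits = self.counts[key]
        self.counts[key] += 1
        return 1.0 / math.sqrt(prior_visits + 1)


def strangeness_bonuses(observations, state, obs_novelty, state_novelty,
                        state_weight=0.5):
    """Return one exploration bonus per agent.

    observations: list of hashable per-agent observations
    state: hashable global state (available during centralized training)
    obs_novelty: list of CountNovelty, one per agent
    state_novelty: a single CountNovelty for the global state
    state_weight: assumed mixing weight for the global term
    """
    # The global-state term is computed once and shared by all agents.
    state_term = state_novelty.score(state)
    return [
        obs_novelty[i].score(obs) + state_weight * state_term
        for i, obs in enumerate(observations)
    ]


if __name__ == "__main__":
    per_agent = [CountNovelty() for _ in range(2)]
    global_novelty = CountNovelty()
    for step in range(3):
        bonuses = strangeness_bonuses([(0, 0), (1, 1)], (0, 0, 1, 1),
                                      per_agent, global_novelty)
        print(step, bonuses)
```

Revisiting the same observations and state makes the bonus shrink, which is the behaviour an unfamiliarity-driven bonus should have. How the bonus is combined with extrinsic rewards is not covered here, since the abstract is truncated at that point.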
Taxonomy
Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications
