Influence-Based Multi-Agent Exploration
Tonghan Wang, Jianhao Wang, Yi Wu, Chongjie Zhang

TL;DR
This paper introduces two influence-based exploration methods for multi-agent reinforcement learning, leveraging interaction dynamics to improve coordinated exploration and team performance in sparse-reward environments.
Contribution
The paper proposes EITI and EDTI, two influence-based intrinsic rewards that improve exploration by exploiting interactions among agents. Both can be integrated with policy gradient methods.
Findings
Significant performance improvements in multi-agent tasks.
Effective coordination of exploration among agents.
Robustness across various multi-agent scenarios.
Abstract
Intrinsically motivated reinforcement learning aims to address the exploration challenge for sparse-reward tasks. However, the study of exploration methods in transition-dependent multi-agent settings is largely absent from the literature. We aim to take a step towards solving this problem. We present two exploration methods: exploration via information-theoretic influence (EITI) and exploration via decision-theoretic influence (EDTI), by exploiting the role of interaction in coordinated behaviors of agents. EITI uses mutual information to capture the influence of agents on one another's transition dynamics. EDTI uses a novel intrinsic reward, called Value of Interaction (VoI), to characterize and quantify the influence of one agent's behavior on expected returns of other agents. By optimizing the EITI or EDTI objective as a regularizer, agents are encouraged to coordinate their exploration and learn policies to optimize…
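To make the mutual-information idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It computes a plug-in (count-based) estimate of a conditional mutual information I(X; Y | Z) from discrete samples. Reading X as agent 2's next state, Y as agent 1's state-action pair, and Z as agent 2's own state-action pair is an assumption made for illustration, as are the function name and the toy "switch and door" data.

```python
from collections import Counter
from math import log


def conditional_mutual_information(samples):
    """Plug-in estimate of I(X; Y | Z) in nats from a list of (x, y, z) tuples."""
    n = len(samples)
    c_xyz = Counter(samples)
    c_xz = Counter((x, z) for x, _, z in samples)
    c_yz = Counter((y, z) for _, y, z in samples)
    c_z = Counter(z for _, _, z in samples)
    total = 0.0
    for (x, y, z), count in c_xyz.items():
        # p(x, y, z) * log[ p(x, y | z) / (p(x | z) * p(y | z)) ], written in counts
        ratio = count * c_z[z] / (c_xz[(x, z)] * c_yz[(y, z)])
        total += (count / n) * log(ratio)
    return total


# Hypothetical toy data: agent 2 pushes a door (z); it passes (x) only when
# agent 1 stands on a switch (y). All names here are illustrative assumptions.
z = ("at_door", "push")
coupled = [
    ("through", ("switch", "stay"), z),
    ("through", ("switch", "stay"), z),
    ("blocked", ("away", "stay"), z),
    ("blocked", ("away", "stay"), z),
]
# Same marginals, but agent 2's outcome no longer depends on agent 1.
independent = [
    ("through", ("switch", "stay"), z),
    ("blocked", ("switch", "stay"), z),
    ("through", ("away", "stay"), z),
    ("blocked", ("away", "stay"), z),
]

print(conditional_mutual_information(coupled))
print(conditional_mutual_information(independent))
```

In the coupled data the estimate is positive because agent 1's position fully determines agent 2's outcome, while in the independent data it is zero. A quantity of this kind could serve as an intrinsic signal that is large where one agent's behavior changes what happens to another.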
Taxonomy
Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control · Experimental Behavioral Economics Studies
