Influence-Based Multi-Agent Exploration

Tonghan Wang; Jianhao Wang; Yi Wu; Chongjie Zhang

arXiv:1910.05512 · cs.LG · December 30, 2019 · 5 cites

TL;DR

This paper introduces two influence-based exploration methods for multi-agent reinforcement learning, leveraging interaction dynamics to improve coordinated exploration and team performance in sparse-reward environments.

Contribution

The paper proposes EITI and EDTI, two exploration methods built on influence-based intrinsic rewards that exploit agent interactions and are integrated with policy gradient methods.

Findings

01. Significant performance improvements in multi-agent tasks.
02. Effective coordination of exploration among agents.
03. Robustness across various multi-agent scenarios.

Abstract

Intrinsically motivated reinforcement learning aims to address the exploration challenge for sparse-reward tasks. However, the study of exploration methods in transition-dependent multi-agent settings is largely absent from the literature. We aim to take a step towards solving this problem. We present two exploration methods: exploration via information-theoretic influence (EITI) and exploration via decision-theoretic influence (EDTI), by exploiting the role of interaction in coordinated behaviors of agents. EITI uses mutual information to capture influence transition dynamics. EDTI uses a novel intrinsic reward, called Value of Interaction (VoI), to characterize and quantify the influence of one agent's behavior on expected returns of other agents. By optimizing EITI or EDTI objective as a regularizer, agents are encouraged to coordinate their exploration and learn policies to optimize…
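As a rough illustration of the mutual-information idea behind EITI, the sketch below computes a plug-in estimate of I(X; Y) from paired samples, where X stands for one agent's (state, action) pair and Y for another agent's next state. This is a simplification made for illustration: the function name `empirical_mutual_information`, the toy data, and the choice of an unconditioned count-based estimator are all assumptions, not the estimator used in the paper or its repository.

```python
from collections import Counter
from math import log


def empirical_mutual_information(pairs):
    """Plug-in estimate of I(X; Y) in nats from a list of (x, y) samples."""
    n = len(pairs)
    if n == 0:
        raise ValueError("need at least one sample")
    joint = Counter(pairs)                 # counts of (x, y)
    px = Counter(x for x, _ in pairs)      # marginal counts of x
    py = Counter(y for _, y in pairs)      # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), written in terms of counts
        mi += (c / n) * log(c * n / (px[x] * py[y]))
    return mi


# Hypothetical toy data (assumption): x is agent 1's (state, action),
# y is agent 2's next state.
# Case 1: agent 2's next state is fully determined by agent 1's action.
dependent = [(("s0", "push"), "door_open"),
             (("s0", "wait"), "door_closed")] * 50
# Case 2: agent 2's next state is unrelated to agent 1's action.
independent = [(("s0", a), y)
               for a in ("push", "wait")
               for y in ("door_open", "door_closed")] * 25

mi_dependent = empirical_mutual_information(dependent)
mi_independent = empirical_mutual_information(independent)
print(mi_dependent, mi_independent)
```

In this toy setup the estimate is ln 2 nats when agent 1's action fully determines agent 2's next state and 0 when the two are unrelated, which is the kind of signal an influence-based exploration bonus could reward.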

Code & Models

Repositories

TonghanWang/EITI-EDTI · tf · Official

Taxonomy

Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control · Experimental Behavioral Economics Studies