Learning to Model Opponent Learning

Ian Davies; Zheng Tian; Jun Wang

arXiv:2006.03923 · cs.LG · June 9, 2020 · 1 citation

TL;DR

This paper introduces LeMOL, a novel method for modeling the learning dynamics of opponents in multi-agent reinforcement learning, addressing non-stationarity and improving agent performance.

Contribution

LeMOL is a new structured opponent modeling approach that captures opponent learning dynamics, surpassing naive behavior cloning in accuracy and stability.

Findings

01 LeMOL outperforms behavior cloning baselines in modeling accuracy.

02 Opponent modeling with LeMOL improves the performance of multi-agent algorithms.

03 Structured opponent models handle non-stationarity in MARL environments better than unstructured ones.

Abstract

Multi-Agent Reinforcement Learning (MARL) considers settings in which a set of coexisting agents interact with one another and their environment. The adaptation and learning of other agents induces non-stationarity in the environment dynamics. This poses a great challenge for value function-based algorithms whose convergence usually relies on the assumption of a stationary environment. Policy search algorithms also struggle in multi-agent settings as the partial observability resulting from an opponent's actions not being known introduces high variance to policy training. Modelling an agent's opponent(s) is often pursued as a means of resolving the issues arising from the coexistence of learning opponents. An opponent model provides an agent with some ability to reason about other agents to aid its own decision making. Most prior works learn an opponent model by assuming the opponent is…
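The non-stationarity described in the abstract can be illustrated with a toy example. The sketch below is not LeMOL and is not taken from the paper or its repository. It assumes a hypothetical opponent whose probability of playing action 1 drifts linearly from 0.1 to 0.9 as it "learns", and compares two naive frequency-count opponent models: one that pools all past actions as if the opponent were stationary, and one that only looks at a recent window. All names and parameters (`opponent_policy`, `run`, `window`) are assumptions made for illustration.

```python
import random


def opponent_policy(step, total_steps):
    """Hypothetical learning opponent: P(action = 1) drifts from 0.1 to 0.9."""
    return 0.1 + 0.8 * step / (total_steps - 1)


def run(total_steps=2000, window=100, seed=0):
    """Simulate the opponent and fit two naive frequency-count models.

    Returns (true_final, cumulative_estimate, windowed_estimate), where each
    value is a probability that the opponent plays action 1.
    """
    rng = random.Random(seed)
    count_action_1 = 0   # pooled over the whole history (stationarity assumed)
    recent = []          # only the last `window` observed actions

    for t in range(total_steps):
        p = opponent_policy(t, total_steps)
        action = 1 if rng.random() < p else 0
        count_action_1 += action
        recent.append(action)
        if len(recent) > window:
            recent.pop(0)

    cumulative_estimate = count_action_1 / total_steps
    windowed_estimate = sum(recent) / len(recent)
    true_final = opponent_policy(total_steps - 1, total_steps)
    return true_final, cumulative_estimate, windowed_estimate


if __name__ == "__main__":
    true_final, cumulative, windowed = run()
    print(f"true final P(a=1):       {true_final:.3f}")
    print(f"stationary-model guess:  {cumulative:.3f}")
    print(f"recent-window guess:     {windowed:.3f}")
```

Under these assumptions the pooled estimate ends up close to the opponent's average behaviour over the whole run rather than its current policy, while the windowed estimate tracks the current policy more closely at the cost of higher variance. Neither model represents how the opponent's policy changes over time, which is the aspect the paper's summary says LeMOL is designed to capture.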

Code & Models

Repositories

ianRDavies/LeMOL (tf, Official)

Taxonomy

Topics: Reinforcement Learning in Robotics · Data Stream Mining Techniques · Domain Adaptation and Few-Shot Learning