Opponent Modeling in Deep Reinforcement Learning
He He, Jordan Boyd-Graber, Kevin Kwok, Hal Daumé III

TL;DR
This paper introduces neural-based opponent modeling in deep reinforcement learning that automatically discovers opponent strategies without supervision, improving performance in multi-agent games.
Contribution
It presents a novel Mixture-of-Experts architecture for opponent modeling that jointly learns policies and opponent behaviors in deep RL.
Findings
Outperforms standard DQN in simulated soccer and trivia games.
Automatically discovers diverse opponent strategies.
Enhances multi-agent RL performance without explicit opponent prediction.
Abstract
Opponent modeling is necessary in multi-agent settings where secondary agents with competing goals also adapt their strategies, yet it remains challenging because strategies interact with each other and change. Most previous work focuses on developing probabilistic models or parameterized strategies for specific applications. Inspired by the recent success of deep reinforcement learning, we present neural-based models that jointly learn a policy and the behavior of opponents. Instead of explicitly predicting the opponent's action, we encode observation of the opponents into a deep Q-Network (DQN); however, we retain explicit modeling (if desired) using multitasking. By using a Mixture-of-Experts architecture, our model automatically discovers different strategy patterns of opponents without extra supervision. We evaluate our models on a simulated soccer game and a popular trivia game,…
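The Mixture-of-Experts idea in the abstract can be illustrated with a minimal sketch. It assumes the final Q-values are a weighted sum of several expert Q-value predictions, with the weights produced by a softmax gate over opponent features. Everything concrete here is an assumption made for illustration, not the authors' implementation: the class name MixtureOfExpertsQ, the single linear layer per expert, the feature dimensions, and the random untrained weights are all hypothetical.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, x):
    """Apply one dense layer: returns weights @ x + bias as a list."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def random_layer(rng, n_out, n_in):
    """Hypothetical helper: an untrained dense layer with zero bias."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


class MixtureOfExpertsQ:
    """Illustrative only: linear experts mixed by an opponent-driven gate."""

    def __init__(self, state_dim, opponent_dim, n_actions, n_experts, seed=0):
        rng = random.Random(seed)
        # Each expert maps state features to one Q-value per action.
        self.experts = [random_layer(rng, n_actions, state_dim)
                        for _ in range(n_experts)]
        # The gate maps opponent features to one logit per expert.
        self.gate = random_layer(rng, n_experts, opponent_dim)

    def gate_weights(self, opponent_features):
        """Mixture weights: a probability distribution over experts."""
        return softmax(linear(*self.gate, opponent_features))

    def q_values(self, state_features, opponent_features):
        """Combine expert Q-values using the gate's mixture weights."""
        w = self.gate_weights(opponent_features)
        per_expert = [linear(W, b, state_features) for W, b in self.experts]
        n_actions = len(per_expert[0])
        return [sum(w[i] * per_expert[i][a] for i in range(len(w)))
                for a in range(n_actions)]


# Usage with made-up feature vectors.
model = MixtureOfExpertsQ(state_dim=4, opponent_dim=3, n_actions=5, n_experts=2)
state = [0.1, -0.3, 0.5, 0.0]
opponent = [1.0, 0.0, 0.2]
weights = model.gate_weights(opponent)
q = model.q_values(state, opponent)
greedy_action = max(range(len(q)), key=q.__getitem__)
```

In this sketch the opponent features influence the result only through the gate, so different opponent observations select different blends of experts without any label saying which strategy the opponent is following. In a trained model the experts and gate would be learned jointly from the Q-learning loss rather than sampled at random.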
Taxonomy
Topics: Reinforcement Learning in Robotics · Sports Analytics and Performance · Explainable Artificial Intelligence (XAI)
Methods: Q-Learning · Dense Connections · Convolution · Deep Q-Network
