
TL;DR
This paper introduces Searching with Opponent-Awareness (SOA), a multi-agent planning method that aims to improve performance and social welfare through opponent-aware planning, without requiring explicit or a priori opponent models.
Contribution
The paper develops an opponent-aware MCTS scheme using multi-armed bandits based on Learning with Opponent-Learning Awareness (LOLA) and compares its effectiveness with other bandits, including UCB1, in several multi-agent settings.
Findings
The benefits of SOA become more evident as the number of agents increases.
Opponent-aware planning improves performance and social welfare in multi-agent systems.
SOA outperforms UCB1 in evaluations.
Abstract
We propose Searching with Opponent-Awareness (SOA), an approach to leverage opponent-aware planning without explicit or a priori opponent models for improving performance and social welfare in multi-agent systems. To this end, we develop an opponent-aware MCTS scheme using multi-armed bandits based on Learning with Opponent-Learning Awareness (LOLA) and compare its effectiveness with other bandits, including UCB1. Our evaluations include several different settings and show that the benefits of SOA are especially evident with an increasing number of agents.
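For readers unfamiliar with the UCB1 baseline named in the abstract, the following is a minimal sketch of a standard UCB1 bandit of the kind commonly used for action selection at an MCTS node. It is not the SOA bandit and is not taken from the paper; the class name `UCB1`, the helper `run_demo`, and the Bernoulli arm probabilities are assumptions chosen for illustration.

```python
import math
import random


class UCB1:
    """Standard UCB1 bandit over a fixed set of arms (illustrative only)."""

    def __init__(self, n_arms):
        self.counts = [0] * n_arms    # number of pulls per arm
        self.means = [0.0] * n_arms   # running mean reward per arm
        self.total = 0                # total pulls across all arms

    def select(self):
        # Pull every arm once before applying the confidence bound.
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm

        def score(arm):
            # Mean reward plus exploration bonus sqrt(2 ln N / n_arm).
            bonus = math.sqrt(2.0 * math.log(self.total) / self.counts[arm])
            return self.means[arm] + bonus

        return max(range(len(self.counts)), key=score)

    def update(self, arm, reward):
        # Incremental update of the running mean for the pulled arm.
        self.counts[arm] += 1
        self.total += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]


def run_demo(steps=2000, seed=0):
    """Run UCB1 on hypothetical Bernoulli arms and return pull counts."""
    rng = random.Random(seed)
    probs = [0.2, 0.5, 0.8]  # assumed reward probabilities, for illustration
    bandit = UCB1(len(probs))
    for _ in range(steps):
        arm = bandit.select()
        reward = 1.0 if rng.random() < probs[arm] else 0.0
        bandit.update(arm, reward)
    return bandit.counts
```

Calling `run_demo()` returns the number of pulls per arm; under these assumed probabilities, the arm with the highest reward probability should receive most of the pulls. UCB1 treats the reward distribution of each arm as fixed, which is the assumption an opponent-aware bandit is meant to relax when other agents are adapting at the same time.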
Taxonomy
Topics: Reinforcement Learning in Robotics · Data Stream Mining Techniques · Artificial Intelligence in Games
