Decision-making with Speculative Opponent Models

Jing Sun; Shuo Chen; Cong Zhang; Yining Ma; Jie Zhang

arXiv:2211.11940 · cs.AI · March 25, 2024

TL;DR

This paper introduces DOMAC, a novel multi-agent reinforcement learning algorithm that models opponents using only local information, improving decision-making and performance in complex multi-agent environments.

Contribution

We propose DOMAC, the first speculative opponent modelling algorithm that relies solely on local information, with distributional critics and a derived policy gradient theorem.
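As a rough illustration of what a distributional critic optimises, the sketch below computes a quantile Huber loss between predicted return quantiles and sampled target returns. The summary here does not say how DOMAC represents the return distribution, so the quantile representation, the function name `quantile_huber_loss`, and the `kappa` parameter are all assumptions borrowed from standard quantile-regression critics, not the authors' implementation.

```python
import numpy as np


def quantile_huber_loss(pred_quantiles, target_samples, kappa=1.0):
    """Quantile Huber loss for a 1-D vector of predicted return quantiles.

    Assumption: the critic outputs N quantile estimates at the midpoint
    fractions tau_i = (i + 0.5) / N. This is a generic sketch, not DOMAC's code.
    """
    n = len(pred_quantiles)
    taus = (np.arange(n) + 0.5) / n

    # Pairwise errors: u[i, j] = target_j - predicted_quantile_i
    u = target_samples[None, :] - pred_quantiles[:, None]
    abs_u = np.abs(u)

    # Huber penalty: quadratic near zero, linear in the tails
    huber = np.where(abs_u <= kappa, 0.5 * u ** 2, kappa * (abs_u - 0.5 * kappa))

    # Asymmetric weight pushes each output toward its own quantile level
    weight = np.abs(taus[:, None] - (u < 0).astype(float))

    # Average over target samples, sum over predicted quantiles
    return float((weight * huber / kappa).mean(axis=1).sum())


loss_zero = quantile_huber_loss(np.array([0.0, 0.0]), np.array([0.0, 0.0]))
loss_pos = quantile_huber_loss(np.array([0.0, 1.0]), np.array([2.0, 3.0]))
```

The loss is zero when predictions already match the targets and positive otherwise; a scalar critic would instead regress a single expected return.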

Findings

1. DOMAC outperforms state-of-the-art methods in multiple benchmarks.

2. DOMAC achieves faster convergence in complex multi-agent tasks.

3. DOMAC effectively models opponent behaviors in challenging environments.

Abstract

Opponent modelling has proven effective in enhancing the decision-making of the controlled agent by constructing models of opponent agents. However, existing methods often rely on access to the observations and actions of opponents, a requirement that is infeasible when such information is either unobservable or challenging to obtain. To address this issue, we introduce Distributional Opponent-aided Multi-agent Actor-Critic (DOMAC), the first speculative opponent modelling algorithm that relies solely on local information (i.e., the controlled agent's observations, actions, and rewards). Specifically, the actor maintains a speculated belief about the opponents using the tailored speculative opponent models that predict the opponents' actions using only local information. Moreover, DOMAC features distributional critic models that estimate the return distribution of the actor's policy,…
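The idea of an actor that acts on a speculated belief about opponents can be sketched as follows. This is a minimal toy under stated assumptions, not the DOMAC architecture: the linear models, the single opponent, the exact marginalisation over its actions, and the names `SpeculativeOpponentModel`, `OpponentAwareActor`, and `marginal_policy` are all hypothetical. The only property taken from the abstract is that the opponent model sees local information alone.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


class SpeculativeOpponentModel:
    """Hypothetical linear model: local observation -> belief over one opponent's actions."""

    def __init__(self, obs_dim, n_opp_actions, rng):
        self.W = rng.normal(scale=0.1, size=(obs_dim, n_opp_actions))

    def predict(self, obs):
        # Uses only the controlled agent's own observation
        return softmax(obs @ self.W)


class OpponentAwareActor:
    """Hypothetical actor conditioned on the observation and a speculated opponent action."""

    def __init__(self, obs_dim, n_opp_actions, n_actions, rng):
        self.n_opp_actions = n_opp_actions
        self.W = rng.normal(scale=0.1, size=(obs_dim + n_opp_actions, n_actions))

    def policy_given(self, obs, opp_action):
        one_hot = np.eye(self.n_opp_actions)[opp_action]
        return softmax(np.concatenate([obs, one_hot]) @ self.W)


def marginal_policy(actor, opp_model, obs):
    """Average the conditional policy over the speculated opponent-action belief."""
    belief = opp_model.predict(obs)
    probs = np.zeros(actor.W.shape[1])
    for opp_action, p in enumerate(belief):
        probs += p * actor.policy_given(obs, opp_action)
    return probs


rng = np.random.default_rng(0)
obs = rng.normal(size=4)                      # toy local observation
opp_model = SpeculativeOpponentModel(4, 3, rng)
actor = OpponentAwareActor(4, 3, 5, rng)
pi = marginal_policy(actor, opp_model, obs)   # distribution over 5 own actions
```

With several opponents or large action spaces, exact marginalisation becomes expensive, and sampling opponent actions from the belief would be the natural alternative.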

Code & Models

Repositories

sunjing1102628/DOMAC (PyTorch)

Taxonomy

Topics: Reinforcement Learning in Robotics · Anomaly Detection Techniques and Applications