Modeling Others using Oneself in Multi-Agent Reinforcement Learning
Roberta Raileanu, Emily Denton, Arthur Szlam, Rob Fergus

TL;DR
This paper introduces Self Other-Modeling (SOM), a multi-agent reinforcement learning method in which an agent uses its own policy to predict the other agent's actions and infer that agent's hidden goal online. Agents using these estimates learn better policies on three tasks spanning cooperative and adversarial settings.
Contribution
The paper presents a new approach called Self Other-Modeling (SOM) that enables agents to infer others' hidden states using their own policy, enhancing multi-agent learning.
Findings
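On the three tasks evaluated, agents using SOM learn better policies by conditioning on their estimate of the other players' hidden states.
SOM lets an agent infer another agent's hidden goal from that agent's observed behavior.
The approach is effective in both cooperative and adversarial settings.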
Abstract
We consider the multi-agent reinforcement learning setting with imperfect information in which each agent is trying to maximize its own utility. The reward function depends on the hidden state (or goal) of both agents, so the agents must infer the other players' hidden goals from their observed behavior in order to solve the tasks. We propose a new approach for learning in these domains: Self Other-Modeling (SOM), in which an agent uses its own policy to predict the other agent's actions and update its belief of their hidden state in an online manner. We evaluate this approach on three different tasks and show that the agents are able to learn better policies using their estimate of the other players' hidden states, in both cooperative and adversarial settings.
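The online belief update described in the abstract can be illustrated with a small sketch. The abstract does not specify the update rule, so this example assumes a plain discrete Bayesian update in which the agent's own policy serves as the likelihood model for the other agent's actions; it is not presented as the authors' procedure. The goal set, action set, and the `own_policy` function are hypothetical and exist only for illustration.

```python
# Hypothetical toy setting: two possible hidden goals, three actions.
GOALS = ["left", "right"]
ACTIONS = ["left", "right", "stay"]


def own_policy(goal):
    """Assumed toy policy: action probabilities the agent itself would use
    if it had the given goal (0.8 toward the goal, 0.1 for each other action)."""
    probs = {action: 0.1 for action in ACTIONS}
    probs[goal] = 0.8
    return probs


def update_belief(belief, observed_action):
    """Bayesian update of the belief over the other agent's hidden goal,
    using the agent's own policy as the model of the other agent."""
    unnormalized = {
        goal: belief[goal] * own_policy(goal)[observed_action] for goal in GOALS
    }
    total = sum(unnormalized.values())
    return {goal: value / total for goal, value in unnormalized.items()}


# Start from a uniform belief and update after each observed action.
belief = {goal: 1.0 / len(GOALS) for goal in GOALS}
for action in ["left", "left", "stay"]:
    belief = update_belief(belief, action)

print(belief)
```

In this sketch, actions that the agent's own policy would favor under a given goal shift the belief toward that goal, while actions that are equally likely under every goal (here `"stay"`) leave the belief unchanged.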
Taxonomy
Topics: Reinforcement Learning in Robotics · Data Stream Mining Techniques · Game Theory and Applications
