Probabilistic Recursive Reasoning for Multi-Agent Reinforcement Learning
Ying Wen, Yaodong Yang, Rui Luo, Jun Wang, Wei Pan

TL;DR
This paper introduces a probabilistic recursive reasoning (PR2) framework for multi-agent reinforcement learning in which each agent models how its opponents would react to its own future actions and best-responds to that prediction, improving convergence in complex games.
Contribution
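It proposes a probabilistic recursive reasoning approach that uses variational Bayes to approximate the opponents' conditional policies, together with decentralized-training-decentralized-execution algorithms that converge to Nash equilibria in multi-agent settings.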
Findings
PR2 algorithms outperform gradient-based methods in complex games.
Reasoning about opponents' beliefs enhances multi-agent learning stability.
The framework successfully converges in non-trivial equilibrium scenarios.
Abstract
Humans are capable of attributing latent mental contents such as beliefs or intentions to others. This social skill is critical in daily life for reasoning about the potential consequences of others' behaviors so as to plan ahead. It is known that humans use such reasoning ability recursively by considering what others believe about their own beliefs. In this paper, we start from level-1 recursion and introduce a probabilistic recursive reasoning (PR2) framework for multi-agent reinforcement learning. Our hypothesis is that it is beneficial for each agent to account for how the opponents would react to its future behaviors. Under the PR2 framework, we adopt variational Bayes methods to approximate the opponents' conditional policies, to which each agent finds the best response and then improves its own policy. We develop decentralized-training-decentralized-execution algorithms,…
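The hypothesis in the abstract, that an agent benefits from accounting for how opponents would react to its action, can be illustrated on a toy one-shot matrix game. The sketch below is not the paper's method: it uses no variational Bayes and no reinforcement learning, and it assumes the opponent's conditional policy is a softmax response to the agent's action. The payoff matrices `A` and `B`, the temperature, and all function names are hypothetical choices made for illustration only.

```python
import numpy as np

def softmax(x, temperature=1.0):
    # Numerically stable softmax over a 1-D array.
    z = (np.asarray(x, dtype=float) - np.max(x)) / temperature
    e = np.exp(z)
    return e / e.sum()

def opponent_conditional_policy(opp_payoff, own_action, temperature=0.1):
    # Assumed model of rho(a_opp | a_own): the opponent responds
    # (softly) to the action the agent is about to take.
    return softmax(opp_payoff[own_action], temperature)

def level0_action_values(own_payoff, opp_marginal):
    # Non-recursive baseline: the opponent's action distribution
    # is treated as independent of the agent's own action.
    return own_payoff @ opp_marginal

def level1_action_values(own_payoff, opp_payoff, temperature=0.1):
    # Level-1 reasoning: score each own action by the expected payoff
    # under the opponent's predicted reaction to that action.
    values = np.empty(own_payoff.shape[0])
    for a in range(own_payoff.shape[0]):
        rho = opponent_conditional_policy(opp_payoff, a, temperature)
        values[a] = rho @ own_payoff[a]
    return values

# Hypothetical payoffs, indexed [own_action, opponent_action].
A = np.array([[4.0, 0.0],
              [2.0, 1.0]])   # agent's payoff
B = np.array([[0.0, 1.0],
              [1.0, 0.0]])   # opponent's payoff

uniform = np.array([0.5, 0.5])
level0_choice = int(np.argmax(level0_action_values(A, uniform)))
level1_choice = int(np.argmax(level1_action_values(A, B)))
print(level0_choice, level1_choice)
```

In this toy game the non-recursive agent is drawn to action 0 by its high nominal payoff, while the level-1 agent anticipates that the opponent would counter action 0 and so prefers action 1.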
Taxonomy
Topics: Reinforcement Learning in Robotics · Game Theory and Applications
