Multi-Agent Adversarial Inverse Reinforcement Learning
Lantao Yu, Jiaming Song, Stefano Ermon

TL;DR
This paper introduces MA-AIRL, a scalable multi-agent inverse reinforcement learning framework that effectively recovers reward functions and improves policy imitation in complex multi-agent environments.
Contribution
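The paper presents an algorithm for multi-agent inverse reinforcement learning that combines a new solution concept with maximum pseudolikelihood estimation inside an adversarial reward learning framework, targeting Markov games with high-dimensional state-action spaces and unknown dynamics.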
Findings
MA-AIRL accurately recovers ground truth reward functions
It significantly outperforms prior methods in policy imitation
The framework is effective and scalable for complex Markov games
Abstract
Reinforcement learning agents are prone to undesired behaviors due to reward mis-specification. Finding a set of reward functions to properly guide agent behaviors is particularly challenging in multi-agent scenarios. Inverse reinforcement learning provides a framework to automatically acquire suitable reward functions from expert demonstrations. Its extension to multi-agent settings, however, is difficult due to the more complex notions of rational behaviors. In this paper, we propose MA-AIRL, a new framework for multi-agent inverse reinforcement learning, which is effective and scalable for Markov games with high-dimensional state-action space and unknown dynamics. We derive our algorithm based on a new solution concept and maximum pseudolikelihood estimation within an adversarial reward learning framework. In the experiments, we demonstrate that MA-AIRL can recover reward functions…
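The "adversarial reward learning framework" mentioned in the abstract can be illustrated with a minimal sketch. It assumes the discriminator form commonly used in adversarial IRL, D = exp(f) / (exp(f) + pi), applied separately to each agent; the function names and the NumPy setting are hypothetical choices for illustration, not the authors' implementation.

```python
import numpy as np


def discriminator_prob(f_value, log_policy_prob):
    """Per-agent discriminator D = exp(f) / (exp(f) + pi(a|s)).

    Assumption: f_value is a learned reward-like score f(s, a) and
    log_policy_prob is log pi(a|s) under the agent's current policy.
    Algebraically this equals sigmoid(f - log pi).
    """
    logit = np.asarray(f_value, dtype=float) - np.asarray(log_policy_prob, dtype=float)
    return 1.0 / (1.0 + np.exp(-logit))


def discriminator_loss(f_expert, logp_expert, f_policy, logp_policy, eps=1e-12):
    """Binary cross-entropy: expert samples labeled 1, policy samples labeled 0."""
    d_expert = discriminator_prob(f_expert, logp_expert)
    d_policy = discriminator_prob(f_policy, logp_policy)
    return -(np.mean(np.log(d_expert + eps)) + np.mean(np.log(1.0 - d_policy + eps)))


def policy_reward(f_value, log_policy_prob):
    """Reward signal for the policy update: log D - log(1 - D) = f - log pi."""
    return np.asarray(f_value, dtype=float) - np.asarray(log_policy_prob, dtype=float)
```

In a multi-agent setting, one such discriminator would be kept per agent, each scoring that agent's own actions; how the scores are parameterized and trained in the paper is not covered by this sketch.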
Taxonomy
Topics: Anomaly Detection Techniques and Applications
