SS-MAIL: Self-Supervised Multi-Agent Imitation Learning
Akshay Dharmavaram, Tejus Gupta, Jiachen Li, Katia P. Sycara

TL;DR
SS-MAIL introduces a self-supervised approach for multi-agent imitation learning that stabilizes training, models multi-modal behaviors, and improves sample efficiency through curriculum learning, outperforming prior methods on real and synthetic tasks.
Contribution
The paper proposes a novel self-supervised loss for adversarial imitation learning (AIL), a graph-based multi-agent architecture, and a curriculum method called Trajectory Forcing, advancing multi-agent imitation learning.
Findings
Outperforms prior state-of-the-art methods on real-world tasks
Provides a theoretical link to cost-regularized apprenticeship learning
Enhances sample efficiency with Trajectory Forcing curriculum
Abstract
The current landscape of multi-agent expert imitation is broadly dominated by two families of algorithms: Behavioral Cloning (BC) and Adversarial Imitation Learning (AIL). BC approaches suffer from compounding errors, as they ignore the sequential decision-making nature of the trajectory generation problem. Furthermore, they cannot effectively model multi-modal behaviors. While AIL methods solve the issue of compounding errors and multi-modal policy training, they are plagued by instability in their training dynamics. In this work, we address this issue by introducing a novel self-supervised loss that encourages the discriminator to approximate a richer reward function. We employ our method to train a graph-based multi-agent actor-critic architecture that learns a centralized policy, conditioned on a learned latent interaction graph. We show that our method (SS-MAIL) outperforms…
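For context, the discriminator objective that AIL methods typically build on can be sketched as below. This is the generic GAIL-style binary cross-entropy loss and the reward derived from it, not the self-supervised loss proposed in the paper, whose exact form is not given in this summary. The function names (`softplus`, `discriminator_loss`, `imitation_reward`) and the convention that the discriminator outputs a raw logit are assumptions made for illustration.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def discriminator_loss(expert_logits, policy_logits):
    """Generic AIL discriminator loss (assumed GAIL-style, not SS-MAIL's loss).

    Expert samples are labeled 1 and policy samples 0. With p = sigmoid(z):
      -log(p)     = softplus(-z)   for expert samples
      -log(1 - p) = softplus(z)    for policy samples
    """
    expert_term = sum(softplus(-z) for z in expert_logits) / len(expert_logits)
    policy_term = sum(softplus(z) for z in policy_logits) / len(policy_logits)
    return expert_term + policy_term


def imitation_reward(logit):
    """Reward handed to the policy: -log(1 - sigmoid(logit)).

    It grows as the discriminator rates a state-action pair as more expert-like.
    """
    return softplus(logit)


# Hypothetical logits: a discriminator that separates the two sources well
# incurs a lower loss than one that outputs 0 for everything.
confident = discriminator_loss([5.0, 4.0], [-5.0, -4.0])
uninformed = discriminator_loss([0.0, 0.0], [0.0, 0.0])
```

Because the policy's reward comes entirely from this binary classifier, the reward signal can become uninformative once the discriminator separates expert and policy samples easily, which is one common account of the training instability the abstract refers to.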
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Autonomous Vehicle Technology and Safety · Human Pose and Action Recognition
