SS-MAIL: Self-Supervised Multi-Agent Imitation Learning

Akshay Dharmavaram; Tejus Gupta; Jiachen Li; Katia P. Sycara

arXiv:2110.08963 · cs.AI · October 19, 2021

TL;DR

SS-MAIL introduces a self-supervised approach for multi-agent imitation learning that stabilizes training, models multi-modal behaviors, and improves sample efficiency through curriculum learning, outperforming prior methods on real and synthetic tasks.

Contribution

The paper proposes a novel self-supervised loss for Adversarial Imitation Learning (AIL), a graph-based multi-agent actor-critic architecture, and a curriculum learning method called Trajectory Forcing.

Findings

1. Outperforms prior state-of-the-art methods on real-world tasks
2. Provides a theoretical link to cost-regularized apprenticeship learning
3. Enhances sample efficiency with the Trajectory Forcing curriculum

Abstract

The current landscape of multi-agent expert imitation is broadly dominated by two families of algorithms: Behavioral Cloning (BC) and Adversarial Imitation Learning (AIL). BC approaches suffer from compounding errors, as they ignore the sequential decision-making nature of the trajectory generation problem. Furthermore, they cannot effectively model multi-modal behaviors. While AIL methods solve the issues of compounding errors and multi-modal policy training, they are plagued by instability in their training dynamics. In this work, we address this instability by introducing a novel self-supervised loss that encourages the discriminator to approximate a richer reward function. We employ our method to train a graph-based multi-agent actor-critic architecture that learns a centralized policy, conditioned on a learned latent interaction graph. We show that our method (SS-MAIL) outperforms…

Taxonomy

Topics: Anomaly Detection Techniques and Applications · Autonomous Vehicle Technology and Safety · Human Pose and Action Recognition