Unlabeled Imperfect Demonstrations in Adversarial Imitation Learning

Yunke Wang; Bo Du; Chang Xu

arXiv:2302.06271 · cs.LG · February 14, 2023

TL;DR

This paper introduces a positive-unlabeled adversarial imitation learning method that learns from expert demonstrations of mixed, unknown quality without requiring labels that mark which demonstrations are optimal.

Contribution

The paper proposes an algorithm that treats imperfect expert demonstrations as unlabeled data in adversarial imitation learning, supported by theoretical analysis and empirical validation.

Findings

01

The method learns effectively from imperfect demonstrations on MuJoCo and RoboSuite tasks.

02

The method adapts dynamically to non-optimal expert data.

03

Theoretical analysis confirms self-paced learning from unlabeled data.

Abstract

Adversarial imitation learning has become a widely used imitation learning framework. The discriminator is often trained by taking expert demonstrations and policy trajectories as examples respectively from two categories (positive vs. negative) and the policy is then expected to produce trajectories that are indistinguishable from the expert demonstrations. But in the real world, the collected expert demonstrations are more likely to be imperfect, where only an unknown fraction of the demonstrations are optimal. Instead of treating imperfect expert demonstrations as absolutely positive or negative, we investigate unlabeled imperfect expert demonstrations as they are. A positive-unlabeled adversarial imitation learning algorithm is developed to dynamically sample expert demonstrations that can well match the trajectories from the constantly optimized agent policy. The trajectories of an…
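To make the positive-unlabeled idea in the abstract concrete, here is a minimal sketch of a non-negative positive-unlabeled (PU) risk for a discriminator. It is not taken from the paper or from the official repository: it uses the generic non-negative PU estimator with a sigmoid loss as a stand-in, and the names `nn_pu_risk`, `pos_scores`, `unl_scores`, and `prior` are hypothetical. The abstract does not say which samples play the positive role or how the class prior is obtained, so both are left to the caller here.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def sigmoid_loss(score: float, label: int) -> float:
    """Sigmoid loss l(z, y) = sigmoid(-y * z), with y in {+1, -1}."""
    return sigmoid(-label * score)


def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def nn_pu_risk(pos_scores, unl_scores, prior):
    """Non-negative PU risk from discriminator scores.

    pos_scores: scores of samples assumed to be positive (hypothetical input).
    unl_scores: scores of unlabeled samples, e.g. imperfect demonstrations.
    prior:      assumed fraction of positives hidden in the unlabeled set.
    """
    # Risk of misclassifying known positives as negative.
    pos_as_pos = mean([sigmoid_loss(s, +1) for s in pos_scores])
    # Negative-class risk is estimated from unlabeled data, then corrected
    # by subtracting the share that the hidden positives contribute.
    unl_as_neg = mean([sigmoid_loss(s, -1) for s in unl_scores])
    pos_as_neg = mean([sigmoid_loss(s, -1) for s in pos_scores])
    neg_risk = unl_as_neg - prior * pos_as_neg
    # Clamp at zero so the estimated negative risk cannot go negative.
    return prior * pos_as_pos + max(0.0, neg_risk)


if __name__ == "__main__":
    print(nn_pu_risk([0.0, 0.0], [0.0, 0.0], prior=0.5))
```

With all scores at zero and `prior=0.5`, every sigmoid loss equals 0.5, so the risk is 0.5 * 0.5 + (0.5 - 0.5 * 0.5) = 0.5. In an adversarial loop, a quantity like this would replace the usual binary cross-entropy discriminator loss; how the actual method samples demonstrations dynamically is described in the paper, not in this sketch.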

Code & Models

Repositories

yunke-wang/uid · PyTorch · Official

Videos

Unlabeled Imperfect Demonstrations in Adversarial Imitation Learning · underline

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Reinforcement Learning in Robotics