Learning from Imperfect Demonstrations via Adversarial Confidence Transfer

Zhangjie Cao; Zihan Wang; Dorsa Sadigh

arXiv:2202.02967·cs.RO·March 3, 2022

Open Access

TL;DR

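This paper introduces a method for learning policies from imperfect demonstrations. A confidence predictor is trained on confidence-labeled demonstrations from a source environment and transferred, via adversarial training, to a target environment that has only unlabeled demonstrations; the predicted confidence then reweights those demonstrations for policy learning. The approach is evaluated in simulation and on a real robot reaching task.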

Contribution

The paper proposes an adversarial approach that transfers a confidence predictor from a source environment to a target environment, enabling policy learning from suboptimal demonstrations in the target environment.

Findings

01. Achieves higher expected return in simulated environments.

02. Effective transfer of confidence improves policy learning.

03. Demonstrates success on a real robot reaching task.

Abstract

Existing learning-from-demonstration algorithms usually assume access to expert demonstrations. However, this assumption is limiting in many real-world applications, since the collected demonstrations may be suboptimal or even consist of failure cases. We therefore study the problem of learning from imperfect demonstrations by learning a confidence predictor. Specifically, we rely on demonstrations along with their confidence values from a different correspondent environment (the source environment) to learn a confidence predictor for the environment we aim to learn a policy in (the target environment, where we only have unlabeled demonstrations). We learn a common latent space through adversarial distribution matching of multi-length partial trajectories to enable the transfer of confidence across source and target environments. The learned confidence reweights the demonstrations to enable…
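
The two ingredients named in the abstract, multi-length partial trajectories and confidence-based reweighting, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the use of trajectory prefixes as the partial trajectories, and the normalized weighted average are all assumptions made for illustration, and the adversarial latent-space matching itself is left out.

```python
from typing import List, Sequence, Tuple

State = Tuple[float, ...]
Step = Tuple[State, State]  # (state, action) pair; both are plain float tuples here


def partial_trajectories(trajectory: Sequence[Step],
                         lengths: Sequence[int]) -> List[List[Step]]:
    """Return the prefixes of `trajectory` with the requested lengths.

    Assumption: prefixes are one simple way to form multi-length partial
    trajectories; the paper may segment trajectories differently.
    Lengths that are non-positive or exceed the trajectory are skipped.
    """
    return [list(trajectory[:n]) for n in lengths if 0 < n <= len(trajectory)]


def weighted_imitation_loss(errors: Sequence[float],
                            confidences: Sequence[float]) -> float:
    """Confidence-weighted average of per-demonstration imitation errors.

    errors[i]      : hypothetical imitation error of the policy on demonstration i
    confidences[i] : predicted confidence for demonstration i, assumed to be >= 0

    Assumption: weights are normalized by their sum, so a demonstration with
    confidence 0 contributes nothing to the loss.
    """
    if len(errors) != len(confidences):
        raise ValueError("errors and confidences must have the same length")
    total = sum(confidences)
    if total == 0:
        raise ValueError("at least one demonstration needs non-zero confidence")
    return sum(c * e for c, e in zip(confidences, errors)) / total
```

Under these assumptions, a demonstration whose predicted confidence is near zero is effectively discarded, while high-confidence demonstrations dominate the imitation objective.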

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Machine Learning and Algorithms · Machine Learning and Data Classification