Learning from Imperfect Demonstrations via Adversarial Confidence Transfer
Zhangjie Cao, Zihan Wang, Dorsa Sadigh

TL;DR
This paper introduces a method to learn effective policies from imperfect demonstrations by transferring confidence estimates across environments using adversarial training, improving performance in real-world tasks.
Contribution
It proposes a novel adversarial approach to transfer confidence predictors from source to target environments, enabling learning from suboptimal demonstrations.
Findings
Achieves higher expected return in simulated environments.
Effective transfer of confidence improves policy learning.
Demonstrates success on a real robot reaching task.
Abstract
Existing learning from demonstration algorithms usually assume access to expert demonstrations. However, this assumption is limiting in many real-world applications since the collected demonstrations may be suboptimal or even consist of failure cases. We therefore study the problem of learning from imperfect demonstrations by learning a confidence predictor. Specifically, we rely on demonstrations along with their confidence values from a different correspondent environment (source environment) to learn a confidence predictor for the environment we aim to learn a policy in (target environment), where we only have unlabeled demonstrations. We learn a common latent space through adversarial distribution matching of multi-length partial trajectories to enable the transfer of confidence across source and target environments. The learned confidence reweights the demonstrations to enable…
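To make the reweighting idea concrete, here is a minimal sketch of confidence-weighted imitation. It is not the authors' implementation: the linear policy, the weighted least-squares fit, the function name `weighted_bc_fit`, and the synthetic data are all assumptions chosen for illustration, and the confidence values are supplied by hand rather than produced by a transferred predictor. The adversarial latent-space matching step is not shown.

```python
import numpy as np

def weighted_bc_fit(states, actions, confidence, ridge=1e-6):
    """Fit a linear policy a = s @ W by confidence-weighted least squares.

    Hypothetical helper: each (state, action) pair contributes to the
    squared imitation loss in proportion to its confidence weight.
    """
    S = np.asarray(states, dtype=float)
    A = np.asarray(actions, dtype=float)
    w = np.asarray(confidence, dtype=float).reshape(-1, 1)
    # Weighted normal equations: (S^T diag(w) S + ridge * I) W = S^T diag(w) A
    gram = S.T @ (w * S) + ridge * np.eye(S.shape[1])
    return np.linalg.solve(gram, S.T @ (w * A))

# Synthetic demonstrations: the first half follows a "good" linear policy,
# the second half follows its negation (a stand-in for failure cases).
rng = np.random.default_rng(0)
states = rng.normal(size=(200, 3))
W_good = np.array([[1.0], [-2.0], [0.5]])
actions = np.vstack([states[:100] @ W_good, states[100:] @ (-W_good)])

# Assumed confidence values: 1 for good demonstrations, 0 for bad ones.
confidence = np.concatenate([np.ones(100), np.zeros(100)])

W_weighted = weighted_bc_fit(states, actions, confidence)
W_uniform = weighted_bc_fit(states, actions, np.ones(200))

err_weighted = np.linalg.norm(W_weighted - W_good)
err_uniform = np.linalg.norm(W_uniform - W_good)
print("error with confidence weights:", err_weighted)
print("error with uniform weights:   ", err_uniform)
```

In this toy setup the uniformly weighted fit is pulled away from the good policy by the failure cases, while the confidence-weighted fit is not. In the paper's setting the weights would come from the confidence predictor transferred from the source environment rather than from known labels.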
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Machine Learning and Algorithms · Machine Learning and Data Classification
