Confidence-Aware Imitation Learning from Demonstrations with Varying Optimality
Songyuan Zhang, Zhangjie Cao, Dorsa Sadigh, Yanan Sui

TL;DR
This paper introduces Confidence-Aware Imitation Learning (CAIL), a framework that learns from demonstrations with varying optimality by jointly learning confidence scores and a well-performing policy. It outperforms existing imitation learning methods in simulated and real robot experiments.
Contribution
The paper proposes a novel CAIL framework that jointly learns confidence scores and policies from non-optimal demonstrations, with theoretical guarantees and superior empirical results.
Findings
CAIL outperforms existing imitation learning methods in simulated and real robot experiments.
CAIL can learn successful policies even without access to optimal demonstrations.
Theoretical guarantees ensure convergence of the proposed framework.
Abstract
Most existing imitation learning approaches assume the demonstrations are drawn from experts who are optimal, but relaxing this assumption enables us to use a wider range of data. Standard imitation learning may learn a suboptimal policy from demonstrations with varying optimality. Prior works use confidence scores or rankings to capture beneficial information from demonstrations with varying optimality, but they suffer from many limitations, e.g., manually annotated confidence scores or high average optimality of demonstrations. In this paper, we propose a general framework to learn from demonstrations with varying optimality that jointly learns the confidence score and a well-performing policy. Our approach, Confidence-Aware Imitation Learning (CAIL), learns a well-performing policy from confidence-reweighted demonstrations, while using an outer loss to track the performance of our…
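To make "confidence-reweighted demonstrations" concrete, here is a minimal sketch of a confidence-weighted imitation loss. It is not the authors' implementation. The linear policy `a = w * s`, the toy data, and the fixed confidence values are all assumptions made for illustration. The excerpt above is truncated before it says how the outer loss updates the confidence, so the confidence is simply held fixed here.

```python
# Illustrative sketch only: a confidence-weighted behavior cloning loss
# for a hypothetical 1-D linear policy a = w * s. The confidence values
# are fixed by hand; learning them via an outer loss is not shown.

def weighted_bc_loss(w, demos, confidence):
    """Confidence-weighted mean squared imitation error."""
    total = 0.0
    weight_sum = 0.0
    for (s, a), c in zip(demos, confidence):
        total += c * (w * s - a) ** 2
        weight_sum += c
    return total / weight_sum


def fit_weighted_policy(demos, confidence):
    """Closed-form minimizer of weighted_bc_loss over the slope w."""
    num = sum(c * s * a for (s, a), c in zip(demos, confidence))
    den = sum(c * s * s for (s, a), c in zip(demos, confidence))
    return num / den


# Hypothetical demonstrations: a good demonstrator acts a = 2*s,
# a poor demonstrator acts a = 0.5*s.
states = [1.0, 2.0, 3.0]
good = [(s, 2.0 * s) for s in states]
poor = [(s, 0.5 * s) for s in states]
demos = good + poor

uniform = [1.0] * len(demos)        # standard imitation: all pairs equal
confident = [1.0] * 3 + [0.1] * 3   # assumed scores: down-weight poor data

w_uniform = fit_weighted_policy(demos, uniform)
w_confident = fit_weighted_policy(demos, confident)
print(w_uniform, w_confident)
```

With uniform weights the fitted slope falls between the two demonstrators. Down-weighting the poor demonstrations moves it toward the good demonstrator's slope of 2, which is the effect the reweighting is meant to achieve.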
Taxonomy
Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Machine Learning and Algorithms
