Conformalized Interactive Imitation Learning: Handling Expert Shift and Intermittent Feedback

Michelle Zhao; Reid Simmons; Henny Admoni; Aaditya Ramdas; Andrea Bajcsy

arXiv:2410.08852 · cs.RO · May 1, 2025

TL;DR

This paper introduces ConformalDAgger, a novel interactive imitation learning method that uses conformal prediction-based uncertainty quantification to adapt to expert policy shifts and intermittent feedback, improving learning efficiency.

Contribution

It develops Intermittent Quantile Tracking (IQT), an online conformal prediction algorithm for quantifying uncertainty when labels arrive only intermittently, and integrates it into ConformalDAgger so the robot can actively query the expert for feedback during deployment.

Findings

1. ConformalDAgger detects high uncertainty during expert shifts.

2. It increases interventions to learn new behaviors faster.

3. It outperforms prior methods in simulated and real robotic tasks.

Abstract

In interactive imitation learning (IL), uncertainty quantification offers a way for the learner (i.e. robot) to contend with distribution shifts encountered during deployment by actively seeking additional feedback from an expert (i.e. human) online. Prior works use mechanisms like ensemble disagreement or Monte Carlo dropout to quantify when black-box IL policies are uncertain; however, these approaches can lead to overconfident estimates when faced with deployment-time distribution shifts. Instead, we contend that we need uncertainty quantification algorithms that can leverage the expert human feedback received during deployment time to adapt the robot's uncertainty online. To tackle this, we draw upon online conformal prediction, a distribution-free method for constructing prediction intervals online given a stream of ground-truth labels. Human labels, however, are intermittent in…
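
The abstract describes online conformal prediction only in words, so a small sketch may help. The code below assumes the standard quantile-tracking update (raise the threshold after a miss, lower it slightly after a cover) and simply skips the update on steps with no expert label. It is not the paper's IQT algorithm, whose details are not given on this page. The function name `quantile_tracking`, the parameters `lr` and `obs_prob`, and the inverse-probability weighting are all hypothetical choices made for illustration.

```python
import random


def quantile_tracking(scores, observed, alpha=0.1, lr=0.05, q0=0.0, obs_prob=None):
    """Track a (1 - alpha) quantile of nonconformity scores online.

    scores[t]   : nonconformity score at step t, e.g. |expert action - robot action|
    observed[t] : True if the expert actually gave a label at step t
    obs_prob    : ASSUMPTION, optional probability of receiving a label;
                  if given, each update is weighted by 1 / obs_prob
    Returns the threshold that was in force at each step.
    """
    q = q0
    history = []
    for score, seen in zip(scores, observed):
        history.append(q)
        if not seen:
            continue  # no label this step, so the threshold is left unchanged
        miss = 1.0 if score > q else 0.0  # 1 when the interval failed to cover
        weight = 1.0 / obs_prob if obs_prob else 1.0
        q += lr * weight * (miss - alpha)
    return history


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic stream: scores grow after step 500, mimicking an expert shift.
    scores = [rng.random() * (1.0 if t < 500 else 3.0) for t in range(1000)]
    observed = [rng.random() < 0.3 for _ in range(1000)]  # intermittent labels
    q = quantile_tracking(scores, observed, obs_prob=0.3)
    print("threshold before shift:", round(q[499], 3))
    print("threshold at end:", round(q[-1], 3))
```

A rising threshold signals growing uncertainty, which is the kind of signal a learner could use to decide when to ask the expert for help.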

Videos

Conformalized Interactive Imitation Learning: Handling Expert Shift and Intermittent Feedback · SlidesLive

Taxonomy

Topics: Innovative Teaching and Learning Methods · Human Motion and Animation · Robot Manipulation and Learning

Methods: Monte Carlo Dropout · Dropout