Selective Sampling and Imitation Learning via Online Regression
Ayush Sekhari, Karthik Sridharan, Wen Sun, Runzhe Wu

TL;DR
This paper introduces an interactive selective sampling algorithm for imitation learning that actively queries noisy experts, achieving near-optimal regret and query bounds with limited expert feedback.
Contribution
It presents a novel selective sampling method for imitation learning with noisy feedback, extending to general function classes and providing tight theoretical bounds.
Findings
Achieves best-known regret and query bounds for noisy imitation learning.
Extends selective sampling to general function approximation with theoretical guarantees.
Provides lower bounds demonstrating the tightness of the results.
Abstract
We consider the problem of Imitation Learning (IL) by actively querying a noisy expert for feedback. While imitation learning has been empirically successful, much prior work assumes access to noiseless expert feedback, which is not practical in many applications. In fact, when one only has access to noisy expert feedback, algorithms that rely on purely offline data (non-interactive IL) can be shown to need a prohibitively large number of samples to be successful. In contrast, in this work, we provide an interactive algorithm for IL that uses selective sampling to actively query the noisy expert for feedback. Our contributions are twofold: First, we provide a new selective sampling algorithm that works with general function classes and multiple actions, and obtains the best-known bounds for the regret and the number of queries. Next, we extend this analysis to the problem of IL with…
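To make the idea of selective sampling with an online regression learner concrete, here is a minimal sketch of a generic margin-based scheme. It is not the paper's algorithm: the binary action set, the linear score model, the fixed `threshold` query rule, the learning rate `lr`, and the names `selective_sampling` and `noisy_expert` are all assumptions chosen for illustration.

```python
import numpy as np

def selective_sampling(contexts, noisy_expert, threshold=0.5, lr=0.1):
    """Generic margin-based selective sampling (illustrative assumption,
    not the method from the paper).

    contexts:     array of shape (n_rounds, dim)
    noisy_expert: hypothetical callable x -> label in {-1, +1}, possibly noisy
    """
    n_rounds, dim = contexts.shape
    w = np.zeros(dim)  # linear score model; the sign of w @ x picks the action
    actions = []
    n_queries = 0
    for x in contexts:
        score = float(w @ x)
        actions.append(1 if score >= 0 else -1)
        # Ask the expert only when the prediction is uncertain (small margin).
        if abs(score) <= threshold:
            y = noisy_expert(x)
            # One online gradient step on the half squared loss 0.5 * (score - y)^2.
            w = w - lr * (score - y) * x
            n_queries += 1
    return np.array(actions), n_queries, w
```

In this sketch the learner acts on every round but pays for expert feedback only on rounds where its own score is close to the decision boundary, which is the trade-off between regret and query count that the abstract refers to.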
Taxonomy
Topics: Machine Learning and Algorithms · Advanced Bandit Algorithms Research · Domain Adaptation and Few-Shot Learning
