On Efficient Online Imitation Learning via Classification

Yichen Li; Chicheng Zhang

arXiv:2209.12868 · cs.LG · September 27, 2022 · 1 citation

TL;DR

This paper investigates the limits and possibilities of classification-based online imitation learning, proposing new algorithms that improve sample efficiency and analyzing fundamental computational barriers in the nonrealizable setting.

Contribution

The paper introduces the Logger framework for improper online learning in COIL, designs two oracle-efficient algorithms within it, and establishes theoretical limitations on dynamic regret minimization.

Findings

01. Proper online algorithms cannot guarantee sublinear regret in general.

02. The Logger framework reduces COIL to online linear optimization.

03. Proposed algorithms outperform naive behavior cloning in finite-sample settings.

Abstract

Imitation learning (IL) is a general learning paradigm for tackling sequential decision-making problems. Interactive imitation learning, where learners can interactively query for expert demonstrations, has been shown to achieve provably superior sample efficiency guarantees compared with its offline counterpart or reinforcement learning. In this work, we study classification-based online imitation learning (abbrev. $COIL$) and the fundamental feasibility to design oracle-efficient regret-minimization algorithms in this setting, with a focus on the general nonrealizable case. We make the following contributions: (1) we show that in the $COIL$ problem, any proper online learning algorithm cannot guarantee a sublinear regret in general; (2) we propose $Logger$, an improper online learning algorithmic framework, that reduces $COIL$ to online linear…
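To make the phrase "online linear optimization" concrete, the following is a minimal sketch, not the paper's Logger algorithms. It assumes a mixed policy can be represented as a probability distribution over a small finite set of base policies (an assumption made for this illustration; the truncated abstract does not specify the construction), and it uses the standard Hedge (multiplicative weights) update in place of the oracle-efficient methods the paper proposes. The names `hedge_mixed_policy`, `sample_states`, and `expert` are hypothetical, and states are drawn from a fixed distribution as a stand-in for rolling out the current learner policy.

```python
import math
import random


def hedge_mixed_policy(policies, rounds, sample_states, expert, eta=0.5, seed=0):
    """Run Hedge over a finite policy class (illustrative, not the paper's method).

    Returns the final mixture weights and the mixture's expected
    classification loss in each round.
    """
    rng = random.Random(seed)
    weights = [1.0 / len(policies)] * len(policies)
    history = []
    for _ in range(rounds):
        # Stand-in for rolling out the learner and querying the expert.
        states = sample_states(rng)
        # 0-1 classification loss of each base policy against the expert.
        losses = [
            sum(pi(s) != expert(s) for s in states) / len(states)
            for pi in policies
        ]
        # The mixture's expected loss is linear in the weights.
        history.append(sum(w * l for w, l in zip(weights, losses)))
        # Multiplicative-weights update followed by renormalization.
        weights = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return weights, history


# Toy setup (all hypothetical): binary actions, states in [0, 1).
policies = [lambda s: 0, lambda s: 1, lambda s: int(s > 0.5)]
expert = lambda s: int(s > 0.5)
sample_states = lambda rng: [rng.random() for _ in range(20)]

weights, history = hedge_mixed_policy(policies, 50, sample_states, expert)
```

In this toy run the third base policy agrees with the expert on every state, so the weights concentrate on it. Because the learner outputs a mixture rather than a single member of the policy class, this is an improper learner in the sense used above.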

Videos

On Efficient Online Imitation Learning via Classification · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research · Machine Learning and Algorithms