Is Behavior Cloning All You Need? Understanding Horizon in Imitation Learning

Dylan J. Foster; Adam Block; Dipendra Misra

arXiv:2407.15007·cs.LG·December 3, 2024·1 cites

Open Access

TL;DR

This paper analyzes the horizon dependence in imitation learning, showing conditions under which offline behavior cloning can achieve horizon-independent sample complexity and comparing it to online methods.

Contribution

It provides a new theoretical analysis of behavior cloning with logarithmic loss, revealing conditions for horizon-independent sample complexity in offline IL and clarifying the offline-online gap.

Findings

01

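Offline IL can achieve horizon-independent sample complexity when the range of the cumulative payoffs and an appropriate notion of supervised learning complexity for the policy class are both controlled.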

02

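For deterministic, stationary policies, the gap between offline and online IL is smaller than previously thought: under dense rewards, offline IL can achieve linear dependence on horizon, which was previously known only for online IL.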

03

Without further assumptions on the policy class, online IL cannot improve over offline IL with the logarithmic loss, even in benign MDPs.

Abstract

Imitation learning (IL) aims to mimic the behavior of an expert in a sequential decision making task by learning from demonstrations, and has been widely applied to robotics, autonomous driving, and autoregressive text generation. The simplest approach to IL, behavior cloning (BC), is thought to incur sample complexity with unfavorable quadratic dependence on the problem horizon, motivating a variety of different online algorithms that attain improved linear horizon dependence under stronger assumptions on the data and the learner's access to the expert. We revisit the apparent gap between offline and online IL from a learning-theoretic perspective, with a focus on the realizable/well-specified setting with general policy classes up to and including deep neural networks. Through a new analysis of behavior cloning with the logarithmic loss, we show that it is possible to achieve…
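To make "behavior cloning with the logarithmic loss" concrete, here is a minimal sketch, assuming a small finite policy class and tabular policies stored as dictionaries. The function names (`log_loss`, `behavior_cloning`), the data layout, and the toy demonstrations are all hypothetical and chosen for illustration; they are not taken from the paper. The sketch only shows the basic idea of selecting the policy that maximizes the likelihood of the expert's actions.

```python
import math


def log_loss(policy, trajectories):
    """Negative log-likelihood of the expert's actions under `policy`.

    policy: dict mapping state -> dict mapping action -> probability.
    trajectories: list of trajectories, each a list of (state, action) pairs.
    """
    total = 0.0
    for trajectory in trajectories:
        for state, action in trajectory:
            prob = policy[state].get(action, 0.0)
            if prob <= 0.0:
                # The policy gives zero mass to an action the expert took.
                return math.inf
            total -= math.log(prob)
    return total


def behavior_cloning(policy_class, trajectories):
    """Return the policy in a finite class with the smallest log loss."""
    return min(policy_class, key=lambda pi: log_loss(pi, trajectories))


# Toy, made-up data: two states, two actions, horizon 2.
expert_demos = [
    [("s0", "left"), ("s1", "right")],
    [("s0", "left"), ("s1", "right")],
]
candidate_a = {
    "s0": {"left": 0.9, "right": 0.1},
    "s1": {"left": 0.2, "right": 0.8},
}
candidate_b = {
    "s0": {"left": 0.5, "right": 0.5},
    "s1": {"left": 0.5, "right": 0.5},
}
best = behavior_cloning([candidate_a, candidate_b], expert_demos)
```

Here `best` is `candidate_a`, since it assigns higher probability to every demonstrated action. With a neural network policy class, the same objective would typically be minimized by gradient descent on a cross-entropy loss rather than by enumeration.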

Taxonomy

Topics: Reinforcement Learning in Robotics · Language and cultural evolution