A Strong Baseline for Batch Imitation Learning

Matthew Smith; Lucas Maystre; Zhenwen Dai; Kamil Ciosek

arXiv:2302.02788·cs.LG·February 7, 2023

Open Access

TL;DR

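This paper introduces a simple batch imitation learning algorithm that needs no hyper-parameter tuning beyond that of a standard batch RL algorithm. It comes with formal sample complexity guarantees in finite Markov Decision Problems, a robust evaluation protocol, and competitive performance on continuous control benchmarks. Because it learns solely from data collected a priori, it suits settings where safety or cost is a critical concern.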

Contribution

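The paper presents a novel, easy-to-implement algorithm for batch imitation learning, with formal sample complexity guarantees in finite Markov Decision Problems, and a new evaluation protocol for offline RL. In establishing the guarantees, it formally demonstrates a previously unproven claim from Kearns & Singh (1998).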

Findings

01

The algorithm achieves competitive results on continuous control tasks.

02

The paper provides formal sample complexity guarantees for the proposed method in finite Markov Decision Problems.

03

The paper establishes a practical, robust and principled evaluation protocol for offline reinforcement learning.

Abstract

Imitation of expert behaviour is a highly desirable and safe approach to the problem of sequential decision making. We provide an easy-to-implement, novel algorithm for imitation learning under a strict data paradigm, in which the agent must learn solely from data collected a priori. This paradigm allows our algorithm to be used for environments in which safety or cost are of critical concern. Our algorithm requires no additional hyper-parameter tuning beyond any standard batch reinforcement learning (RL) algorithm, making it an ideal baseline for such data-strict regimes. Furthermore, we provide formal sample complexity guarantees for the algorithm in finite Markov Decision Problems. In doing so, we formally demonstrate an unproven claim from Kearns & Singh (1998). On the empirical side, our contribution is twofold. First, we develop a practical, robust and principled evaluation…

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Machine Learning and Data Classification