Imitation Learning by Reinforcement Learning

Kamil Ciosek

arXiv:2108.04763 · stat.ML · March 16, 2022

TL;DR

This paper shows that imitation learning from a deterministic expert can be reduced to reinforcement learning with a stationary reward, and backs the reduction with theoretical guarantees and experiments on continuous control tasks.

Contribution

It introduces a reduction method for imitation learning from deterministic experts to reinforcement learning, with theoretical analysis and empirical validation.

Findings

01. The theoretical analysis certifies that the reduction recovers the expert's reward.

02. The analysis bounds the total variation distance between the expert and the imitation learner.

03. Experiments confirm that the reduction works well in practice on continuous control tasks.
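To make the second finding concrete, here is a minimal sketch of total variation distance between two discrete distributions, using the standard definition of half the sum of absolute differences. The state-action pairs and the helper names `empirical_distribution` and `total_variation` are hypothetical illustrations; the page does not say which distributions the paper's bound compares, so treating them as empirical state-action frequencies is an assumption.

```python
from collections import Counter


def empirical_distribution(pairs):
    """Turn a list of (state, action) pairs into relative frequencies."""
    counts = Counter(pairs)
    total = sum(counts.values())
    return {pair: count / total for pair, count in counts.items()}


def total_variation(p, q):
    """Total variation distance: half the L1 distance between p and q."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)


# Hypothetical visited (state, action) pairs, invented for illustration.
expert_pairs = [(0, "right"), (1, "right"), (2, "right"), (3, "stay")]
learner_pairs = [(0, "right"), (1, "right"), (2, "left"), (1, "right")]

expert_dist = empirical_distribution(expert_pairs)
learner_dist = empirical_distribution(learner_pairs)
tv = total_variation(expert_dist, learner_dist)
print(tv)
```

A distance of 0 means the two distributions coincide, and 1 means they share no support.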

Abstract

Imitation learning algorithms learn a policy from demonstrations of expert behavior. We show that, for deterministic experts, imitation learning can be done by reduction to reinforcement learning with a stationary reward. Our theoretical analysis both certifies the recovery of expert reward and bounds the total variation distance between the expert and the imitation learner, showing a link to adversarial imitation learning. We conduct experiments which confirm that our reduction works well in practice for continuous control tasks.
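The reduction described in the abstract can be sketched on a toy problem. The abstract does not spell out how the stationary reward is built, so this example assumes an indicator reward: 1 when a state-action pair appears in the expert demonstrations and 0 otherwise. The five-state chain environment, the discount factor, and all function names are hypothetical, and tabular value iteration stands in for whatever reinforcement learning solver the paper uses.

```python
import numpy as np

# Toy settings chosen for illustration only (not taken from the paper).
N_STATES, N_ACTIONS, GAMMA = 5, 2, 0.9


def step(state, action):
    """Deterministic chain: action 0 moves left, action 1 moves right."""
    delta = 1 if action == 1 else -1
    return min(max(state + delta, 0), N_STATES - 1)


def expert_policy(state):
    """Hypothetical deterministic expert that always moves right."""
    return 1


# Collect the set of (state, action) pairs seen in one expert rollout.
expert_pairs = set()
state = 0
for _ in range(10):
    action = expert_policy(state)
    expert_pairs.add((state, action))
    state = step(state, action)


def imitation_reward(state, action):
    """Assumed stationary reward: 1 on expert pairs, 0 elsewhere."""
    return 1.0 if (state, action) in expert_pairs else 0.0


def value_iteration(n_iters=500):
    """Solve the MDP under the imitation reward with tabular value iteration."""
    q = np.zeros((N_STATES, N_ACTIONS))
    for _ in range(n_iters):
        v = q.max(axis=1)  # state values from the previous sweep
        for s in range(N_STATES):
            for a in range(N_ACTIONS):
                q[s, a] = imitation_reward(s, a) + GAMMA * v[step(s, a)]
    return q


q_values = value_iteration()
learned_policy = q_values.argmax(axis=1)
print(learned_policy)
```

Because the reward depends only on the fixed demonstration set, it never changes during training, which is what distinguishes a stationary reward from a discriminator-based reward that is updated alongside the policy.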

Code & Models

Repositories

spotify-research/il-by-rl (PyTorch, official)

Videos

Imitation Learning by Reinforcement Learning · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications · Robot Manipulation and Learning