Imitation Learning by Reinforcement Learning
Kamil Ciosek

TL;DR
This paper shows that imitation learning from a deterministic expert can be reduced to reinforcement learning with a stationary reward. It backs the reduction with theoretical guarantees and with experiments on continuous control tasks.
Contribution
The paper introduces a reduction of imitation learning from deterministic experts to reinforcement learning with a stationary reward, supported by theoretical analysis and empirical validation.
Findings
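The theoretical analysis certifies that the reduction recovers the expert reward.
The analysis also bounds the total variation distance between the expert and the imitation learner, which links the reduction to adversarial imitation learning.
Experiments confirm that the reduction works well in practice on continuous control tasks.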
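For readers unfamiliar with the quantity in the second finding, the total variation distance between two discrete distributions p and q is half the sum of |p(x) - q(x)| over all outcomes x. The sketch below computes it for two empirical visitation distributions. This summary does not say which distributions the paper compares, so treating them as state-action visit frequencies is an assumption, and the function names and sample data are hypothetical.

```python
from collections import Counter


def empirical_distribution(samples):
    """Turn a list of hashable outcomes into a dict of relative frequencies."""
    counts = Counter(samples)
    total = sum(counts.values())
    return {outcome: count / total for outcome, count in counts.items()}


def total_variation(p, q):
    """Total variation distance: half the L1 distance between p and q."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)


# Illustrative (state, action) visits; made up for this example, not paper data.
expert_visits = [(0, 1), (1, 1), (2, 1), (3, 1)]
learner_visits = [(0, 1), (1, 1), (2, 0), (1, 1)]

tv = total_variation(
    empirical_distribution(expert_visits),
    empirical_distribution(learner_visits),
)
print(tv)  # 0.5
```

A value of 0 means the two distributions coincide, and a value of 1 means they never overlap.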
Abstract
Imitation learning algorithms learn a policy from demonstrations of expert behavior. We show that, for deterministic experts, imitation learning can be done by reduction to reinforcement learning with a stationary reward. Our theoretical analysis both certifies the recovery of expert reward and bounds the total variation distance between the expert and the imitation learner, showing a link to adversarial imitation learning. We conduct experiments which confirm that our reduction works well in practice for continuous control tasks.
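To make the idea of a reduction concrete, here is a minimal sketch on a toy problem. The abstract does not spell out the form of the stationary reward, so the sketch assumes an indicator reward: 1 for state-action pairs that appear in the expert demonstrations and 0 otherwise. The chain environment, the function names, and the use of value iteration as a stand-in for a reinforcement learning algorithm are all assumptions made for illustration and are not taken from the paper.

```python
def make_imitation_reward(expert_pairs):
    """Assumed stationary reward: 1.0 on expert (state, action) pairs, else 0.0."""
    expert_set = set(expert_pairs)

    def reward(state, action):
        return 1.0 if (state, action) in expert_set else 0.0

    return reward


# Toy deterministic chain with states 0..4; action 0 moves left, action 1 moves right.
N_STATES = 5
ACTIONS = (0, 1)
GAMMA = 0.9


def step(state, action):
    """Deterministic transition, clipped at both ends of the chain."""
    if action == 1:
        return min(state + 1, N_STATES - 1)
    return max(state - 1, 0)


def expert_policy(state):
    """Hypothetical deterministic expert: always move right."""
    return 1


# Demonstrations: the expert's action in every state of the chain.
expert_pairs = [(s, expert_policy(s)) for s in range(N_STATES)]
reward = make_imitation_reward(expert_pairs)


def q_value_iteration(reward_fn, n_iters=200):
    """Stand-in for an RL algorithm: Q-value iteration with known dynamics."""
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(n_iters):
        q = {
            (s, a): reward_fn(s, a)
            + GAMMA * max(q[(step(s, a), b)] for b in ACTIONS)
            for s in range(N_STATES)
            for a in ACTIONS
        }
    return q


q = q_value_iteration(reward)
learned_policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES)]
print(learned_policy)  # [1, 1, 1, 1, 1]
```

The reward function is fixed before learning starts and never changes, which is what makes it stationary. In this toy case, maximizing it reproduces the expert's action in every state.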
Taxonomy
Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications · Robot Manipulation and Learning
