Proximal Point Imitation Learning

Luca Viano; Angeliki Kamoutsi; Gergely Neu; Igor Krawczuk; Volkan Cevher

arXiv:2209.10968 · cs.LG · May 31, 2023

TL;DR

This paper introduces new algorithms for infinite-horizon imitation learning with linear function approximation, using the proximal-point method for the online setting and dual smoothing for the offline setting, with theoretical efficiency guarantees for both.

Contribution

It develops a unified proximal-point-based framework for imitation learning that avoids nested policy evaluations and comes with rigorous efficiency guarantees.

Findings

01 Achieves theoretical efficiency guarantees for online IL without nested policy evaluations.

02 Provides an offline IL algorithm with guarantees, using dual smoothing and expert trajectories.

03 Demonstrates strong empirical performance with both linear and neural network function approximation.

Abstract

This work develops new algorithms with rigorous efficiency guarantees for infinite horizon imitation learning (IL) with linear function approximation without restrictive coherence assumptions. We begin with the minimax formulation of the problem and then outline how to leverage classical tools from optimization, in particular, the proximal-point method (PPM) and dual smoothing, for online and offline IL, respectively. Thanks to PPM, we avoid nested policy evaluation and cost updates for online IL appearing in the prior literature. In particular, we do away with the conventional alternating updates by the optimization of a single convex and smooth objective over both cost and Q-functions. When solved inexactly, we relate the optimization errors to the suboptimality of the recovered policy. As an added bonus, by re-interpreting PPM as dual smoothing with the expert policy as a center…
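As a reminder of what a proximal-point update looks like in its simplest form, here is a minimal sketch on a toy quadratic objective. It is not the algorithm from the paper, which applies PPM to the minimax formulation of IL. The function name `proximal_point_quadratic`, the step size `eta`, and the matrices `A` and `b` are all hypothetical choices made for illustration, and the sketch uses the plain Euclidean proximal term.

```python
import numpy as np

def proximal_point_quadratic(A, b, x0, eta=1.0, num_iters=50):
    """Proximal-point iterations for f(x) = 0.5 * x^T A x - b^T x.

    Assumes A is symmetric positive definite (toy setting, not the paper's).
    """
    x = np.asarray(x0, dtype=float)
    M = np.eye(len(x)) + eta * A
    for _ in range(num_iters):
        # Each step solves x_next = argmin_x f(x) + ||x - x_k||^2 / (2 * eta).
        # Setting the gradient to zero gives (I + eta * A) x_next = x_k + eta * b.
        x = np.linalg.solve(M, x + eta * b)
    return x

# Hypothetical toy problem.
A = np.array([[2.0, 0.5], [0.5, 1.0]])
b = np.array([1.0, -1.0])

x_ppm = proximal_point_quadratic(A, b, x0=np.zeros(2))
x_star = np.linalg.solve(A, b)  # exact minimizer of f, for comparison
```

Each iteration minimizes the objective plus a penalty for moving away from the current iterate, so the iterates approach the minimizer `x_star` for any positive `eta`. In this toy case the inner minimization has a closed form; in general it has to be solved approximately, which is the situation the abstract refers to when it relates optimization errors to the suboptimality of the recovered policy.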

Code & Models

Videos

Proximal Point Imitation Learning · SlidesLive

Taxonomy

Topics: Advanced Bandit Algorithms Research · Model Reduction and Neural Networks · Reinforcement Learning in Robotics