A Coupled Flow Approach to Imitation Learning

Gideon Freund; Elad Sarafian; Sarit Kraus

arXiv:2305.00303 · cs.LG · May 2, 2023 · 2 citations

TL;DR

This paper introduces Coupled Flow Imitation Learning (CFIL), an imitation learning method that explicitly models state distributions with a pair of coupled normalizing flows and achieves state-of-the-art results on benchmark tasks from as little as a single expert trajectory.

Contribution

The paper proposes a coupled flow model for explicit state distribution estimation in imitation learning and extends it to the subsampled and state-only settings.

Findings

01. Achieves state-of-the-art performance on benchmark tasks.

02. Effective with only a single expert trajectory.

03. Extends to subsampled and state-only settings.

Abstract

In reinforcement learning and imitation learning, an object of central importance is the state distribution induced by the policy. It plays a crucial role in the policy gradient theorem, and references to it--along with the related state-action distribution--can be found all across the literature. Despite its importance, the state distribution is mostly discussed indirectly and theoretically, rather than being modeled explicitly. The reason is the absence of appropriate density estimation tools. In this work, we investigate applications of a normalizing flow-based model for the aforementioned distributions. In particular, we use a pair of flows coupled through the optimality point of the Donsker-Varadhan representation of the Kullback-Leibler (KL) divergence, for distribution matching based imitation learning. Our algorithm, Coupled Flow Imitation Learning (CFIL), achieves…
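
For readers unfamiliar with the Donsker-Varadhan representation mentioned in the abstract, it states that KL(p || q) = sup_f E_p[f] - log E_q[exp(f)], with the supremum attained at f*(x) = log p(x)/q(x) (up to an additive constant). Since f* is a difference of two log-densities, it can in principle be expressed through a pair of density models. The sketch below only checks the identity numerically on two small discrete distributions; the distributions `p` and `q`, the helper names, and the test function `f_other` are illustrative assumptions and not the authors' implementation, which uses normalizing flows over continuous states.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for discrete distributions given as lists of probabilities."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def dv_objective(f, p, q):
    """Donsker-Varadhan objective E_p[f] - log E_q[exp(f)] for a function f,
    given as a list with one value per outcome."""
    expectation_p = sum(pi * fi for pi, fi in zip(p, f))
    log_expectation_q = math.log(sum(qi * math.exp(fi) for qi, fi in zip(q, f)))
    return expectation_p - log_expectation_q

# Hypothetical toy distributions standing in for agent and expert state densities.
p = [0.5, 0.3, 0.2]
q = [0.2, 0.3, 0.5]

# Optimality point of the DV representation: the log density ratio.
f_star = [math.log(pi / qi) for pi, qi in zip(p, q)]

# Any other function gives a lower bound on the KL divergence.
f_other = [0.0, 1.0, -1.0]

kl = kl_divergence(p, q)
dv_at_optimum = dv_objective(f_star, p, q)
dv_elsewhere = dv_objective(f_other, p, q)

print(f"KL(p || q)          = {kl:.6f}")
print(f"DV objective at f*  = {dv_at_optimum:.6f}")
print(f"DV objective at f   = {dv_elsewhere:.6f}")
```

At `f_star` the objective coincides with the KL divergence, while the arbitrary `f_other` yields a strictly smaller value, which is the sense in which the log density ratio is the optimality point.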

Code & Models

Videos

A Coupled Flow Approach to Imitation Learning · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics · Fuel Cells and Related Materials · Domain Adaptation and Few-Shot Learning