IQ-Learn: Inverse soft-Q Learning for Imitation

Divyansh Garg; Shuvam Chakraborty; Chris Cundy; Jiaming Song; Matthieu Geist; Stefano Ermon

arXiv:2106.12142 · cs.LG · November 4, 2022 · 1 citation



TL;DR

IQ-Learn introduces a dynamics-aware imitation learning method that learns a single Q-function to implicitly represent reward and policy, outperforming existing methods in high-dimensional environments without adversarial training.
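To illustrate how a single Q-function can implicitly encode both a policy and a reward, here is a minimal sketch in the soft (maximum-entropy) Q-learning setting. It assumes a small tabular problem with discrete actions, a temperature `alpha`, and a single observed next state per transition; the function names and the toy `q_table` are hypothetical and are not taken from the paper's code.

```python
import numpy as np

def soft_value(q_row, alpha=1.0):
    """Soft value V(s) = alpha * log sum_a exp(Q(s, a) / alpha), computed stably."""
    z = q_row / alpha
    m = np.max(z)
    return alpha * (m + np.log(np.sum(np.exp(z - m))))

def soft_policy(q_row, alpha=1.0):
    """Policy implied by Q: pi(a | s) proportional to exp(Q(s, a) / alpha)."""
    z = q_row / alpha
    z = z - np.max(z)  # shift for numerical stability
    p = np.exp(z)
    return p / p.sum()

def implied_reward(q, s, a, s_next, gamma=0.99, alpha=1.0):
    """Reward implied by Q on one transition: r = Q(s, a) - gamma * V(s_next).

    Assumption: one sampled next state stands in for the expectation
    over the environment dynamics.
    """
    return q[s, a] - gamma * soft_value(q[s_next], alpha)

# Toy Q-table (hypothetical values): 3 states, 2 actions.
q_table = np.array([[1.0, 2.0],
                    [0.5, 0.0],
                    [0.0, 0.0]])

pi0 = soft_policy(q_table[0])                              # action distribution in state 0
r = implied_reward(q_table, s=0, a=1, s_next=2, gamma=0.9)  # reward for (0, 1) -> 2
```

Because both quantities are read off the same Q-table, no separate reward or policy network is needed in this sketch, which is the property the summary above refers to.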

Contribution

The paper presents IQ-Learn, a novel inverse soft-Q learning approach that avoids adversarial training and effectively utilizes environment dynamics for imitation learning.

Findings

01 Achieves state-of-the-art results in offline and online imitation learning.

02 Significantly reduces environment interactions needed compared to prior methods.

03 Scales effectively to high-dimensional environments.

Abstract

In many sequential decision-making problems (e.g., robotics control, game playing, sequential prediction), human or expert data is available containing useful information about the task. However, imitation learning (IL) from a small amount of expert data can be challenging in high-dimensional environments with complex dynamics. Behavioral cloning is a simple method that is widely used due to its simplicity of implementation and stable convergence but doesn't utilize any information involving the environment's dynamics. Many existing methods that exploit dynamics information are difficult to train in practice due to an adversarial optimization process over reward and policy approximators or biased, high-variance gradient estimators. We introduce a method for dynamics-aware IL which avoids adversarial training by learning a single Q-function, implicitly representing both reward and…

Code & Models

Videos

IQ-Learn: Inverse soft-Q Learning for Imitation · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications

Methods: Inverse Q-Learning