IQ-Learn: Inverse soft-Q Learning for Imitation
Divyansh Garg, Shuvam Chakraborty, Chris Cundy, Jiaming Song, Matthieu Geist, Stefano Ermon

TL;DR
IQ-Learn introduces a dynamics-aware imitation learning method that learns a single Q-function to implicitly represent reward and policy, outperforming existing methods in high-dimensional environments without adversarial training.
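To make "a single Q-function that implicitly represents reward and policy" concrete, the following is a minimal sketch, not the authors' implementation. It assumes a tabular soft Q-function over discrete actions, an entropy temperature of 1, and deterministic transitions (in general the next-state value would be an expectation over next states). The Q-table values, the discount factor, and all function names are hypothetical and chosen only for illustration.

```python
import math

GAMMA = 0.9  # discount factor (assumed value for illustration)


def soft_value(q_row):
    """Soft state value V(s) = log sum_a exp(Q(s, a)), computed stably."""
    m = max(q_row)
    return m + math.log(sum(math.exp(q - m) for q in q_row))


def implied_policy(q_row):
    """Policy implied by Q under the soft-optimality assumption:
    pi(a|s) = exp(Q(s, a) - V(s))."""
    v = soft_value(q_row)
    return [math.exp(q - v) for q in q_row]


def implied_reward(q_table, state, action, next_state, gamma=GAMMA):
    """Reward implied by Q, assuming a deterministic transition to next_state:
    r(s, a) = Q(s, a) - gamma * V(s')."""
    return q_table[state][action] - gamma * soft_value(q_table[next_state])


# Hypothetical 2-state, 2-action Q-table (rows are states, columns are actions).
Q = [[1.0, 0.0],
     [0.5, 0.5]]

policy_s0 = implied_policy(Q[0])  # action probabilities in state 0
reward_00 = implied_reward(Q, state=0, action=0, next_state=1)
```

Both the policy and the reward are read directly off the same table, so no separate reward or policy approximator has to be trained against the other.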
Contribution
The paper presents IQ-Learn, a novel inverse soft-Q learning approach that avoids adversarial training and effectively utilizes environment dynamics for imitation learning.
Findings
Achieves state-of-the-art results in offline and online imitation learning.
Significantly reduces environment interactions needed compared to prior methods.
Scales effectively to high-dimensional environments.
Abstract
In many sequential decision-making problems (e.g., robotics control, game playing, sequential prediction), human or expert data is available containing useful information about the task. However, imitation learning (IL) from a small amount of expert data can be challenging in high-dimensional environments with complex dynamics. Behavioral cloning is a simple method that is widely used due to its simplicity of implementation and stable convergence but doesn't utilize any information involving the environment's dynamics. Many existing methods that exploit dynamics information are difficult to train in practice due to an adversarial optimization process over reward and policy approximators or biased, high-variance gradient estimators. We introduce a method for dynamics-aware IL which avoids adversarial training by learning a single Q-function, implicitly representing both reward and…
Taxonomy
Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications
Methods: Inverse Q-Learning
