Probability Density Estimation Based Imitation Learning

Yang Liu; Yongzhe Chang; Shilei Jiang; Xueqian Wang; Bin Liang; Bo Yuan

arXiv:2112.06746·cs.LG·December 14, 2021

TL;DR

This paper introduces an inverse reinforcement learning (IRL) method that builds its reward function from probability density estimates, which simplifies reward learning and recovers the expert policy efficiently and accurately in both discrete and continuous action spaces.

Contribution

It proposes a new reward function based on density estimation, transforming IRL into a density estimation problem, and presents a practical framework called PDEIL.
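
To make the idea of a density-based reward concrete, here is a minimal sketch. It is not the paper's reward function, whose exact form is not given on this page. It assumes a reward equal to the estimated expert action density conditioned on the state, computed as the ratio of a joint density estimate over (state, action) to a marginal estimate over states. The Gaussian kernel, the bandwidth value, and the names `gaussian_kde_density` and `density_reward` are all hypothetical choices made for illustration.

```python
import numpy as np


def gaussian_kde_density(samples, queries, bandwidth=0.3):
    """Isotropic Gaussian kernel density estimate.

    samples: array of shape (n, d), points the density is fitted to.
    queries: array of shape (m, d), points where the density is evaluated.
    Returns an array of shape (m,).
    """
    samples = np.asarray(samples, dtype=float)
    queries = np.asarray(queries, dtype=float)
    d = samples.shape[1]
    diff = queries[:, None, :] - samples[None, :, :]  # shape (m, n, d)
    sq_dist = np.sum(diff ** 2, axis=-1)              # shape (m, n)
    norm = (2.0 * np.pi * bandwidth ** 2) ** (d / 2.0)
    return np.exp(-0.5 * sq_dist / bandwidth ** 2).mean(axis=1) / norm


def density_reward(expert_states, expert_actions, states, actions,
                   bandwidth=0.3, eps=1e-8):
    """Assumed reward: estimated expert density of the action given the state.

    This ratio of joint to marginal density is an illustrative stand-in,
    not the reward defined in the paper.
    """
    expert_sa = np.hstack([expert_states, expert_actions])
    query_sa = np.hstack([states, actions])
    joint = gaussian_kde_density(expert_sa, query_sa, bandwidth)
    marginal = gaussian_kde_density(expert_states, states, bandwidth)
    return joint / (marginal + eps)


# Toy demonstrations from a deterministic expert policy a = 2 * s.
rng = np.random.default_rng(0)
expert_states = rng.uniform(-1.0, 1.0, size=(500, 1))
expert_actions = 2.0 * expert_states

# Same state, two candidate actions: the expert's action and a different one.
states = np.array([[0.5], [0.5]])
actions = np.array([[1.0], [-1.0]])
rewards = density_reward(expert_states, expert_actions, states, actions)
print(rewards)
```

In this toy setup the expert-like action receives the larger reward, because the demonstrations concentrate around a = 2 * s. Any reinforcement learning algorithm could then be trained against such a reward; that step is omitted here.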

Findings

01. PDEIL outperforms existing algorithms in reward recovery accuracy.

02. The method is effective in both discrete and continuous action spaces.

03. In theory, the optimal policy under the proposed reward is identical to the expert policy when the expert policy is deterministic.

Abstract

Imitation Learning (IL) is an effective learning paradigm exploiting the interactions between agents and environments. It does not require explicit reward signals and instead tries to recover desired policies using expert demonstrations. In general, IL methods can be categorized into Behavioral Cloning (BC) and Inverse Reinforcement Learning (IRL). In this work, a novel reward function based on probability density estimation is proposed for IRL, which can significantly reduce the complexity of existing IRL methods. Furthermore, we prove that the theoretically optimal policy derived from our reward function is identical to the expert policy as long as the expert policy is deterministic. Consequently, an IRL problem can be gracefully transformed into a probability density estimation problem. Based on the proposed reward function, we present a "watch-try-learn" style framework named Probability Density…

Taxonomy

Topics: Reinforcement Learning in Robotics