Probability Density Estimation Based Imitation Learning
Yang Liu, Yongzhe Chang, Shilei Jiang, Xueqian Wang, Bin Liang, Bo Yuan

TL;DR
This paper introduces a novel IRL method using probability density estimation to simplify reward function learning, achieving efficient and accurate policy recovery in both discrete and continuous action spaces.
Contribution
It proposes a new reward function based on density estimation, transforming IRL into a density estimation problem, and presents a practical framework called PDEIL.
Findings
PDEIL outperforms existing algorithms in reward recovery accuracy.
The method is effective in both discrete and continuous action spaces.
In theory, the optimal policy derived from the proposed reward function is identical to the expert policy, provided the expert policy is deterministic.
Abstract
Imitation Learning (IL) is an effective learning paradigm exploiting the interactions between agents and environments. It does not require explicit reward signals and instead tries to recover desired policies using expert demonstrations. In general, IL methods can be categorized into Behavioral Cloning (BC) and Inverse Reinforcement Learning (IRL). In this work, a novel reward function based on probability density estimation is proposed for IRL, which can significantly reduce the complexity of existing IRL methods. Furthermore, we prove that the theoretically optimal policy derived from our reward function is identical to the expert policy as long as it is deterministic. Consequently, an IRL problem can be gracefully transformed into a probability density estimation problem. Based on the proposed reward function, we present a "watch-try-learn" style framework named Probability Density…
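To make the idea of turning reward learning into density estimation more concrete, here is a minimal sketch. It is not the paper's implementation: this summary does not state the exact form of the PDEIL reward, so the sketch assumes, purely for illustration, that the reward of a state-action pair is the estimated log-density of that pair under the expert demonstrations. The Gaussian kernel density estimator, the bandwidth value, the toy expert policy, and all function names are hypothetical choices.

```python
import numpy as np


def fit_gaussian_kde(samples, bandwidth=0.3):
    """Return a function that evaluates a Gaussian kernel density estimate (in log space).

    samples: array of shape (n, d), e.g. concatenated expert (state, action) pairs.
    bandwidth: kernel width (hypothetical default; needs tuning in practice).
    """
    samples = np.asarray(samples, dtype=float)
    n, d = samples.shape
    # Log of the normalizing constant of an isotropic Gaussian kernel, averaged over n samples.
    log_norm = -0.5 * d * np.log(2.0 * np.pi * bandwidth**2) - np.log(n)

    def log_density(points):
        points = np.atleast_2d(np.asarray(points, dtype=float))
        # Squared distances between every query point and every sample.
        diff = points[:, None, :] - samples[None, :, :]
        sq_dist = np.sum(diff**2, axis=-1)
        exponent = -0.5 * sq_dist / bandwidth**2
        # Log-sum-exp over samples for numerical stability.
        max_exp = exponent.max(axis=1, keepdims=True)
        return log_norm + max_exp[:, 0] + np.log(np.exp(exponent - max_exp).sum(axis=1))

    return log_density


def density_reward(log_density, state, action):
    """Illustrative reward (an assumption): expert log-density of the (state, action) pair."""
    pair = np.concatenate([np.atleast_1d(state), np.atleast_1d(action)])
    return float(log_density(pair)[0])


# Toy deterministic expert on a 1-D state: action = 2 * state.
rng = np.random.default_rng(0)
expert_states = rng.uniform(-1.0, 1.0, size=(500, 1))
expert_actions = 2.0 * expert_states
expert_log_density = fit_gaussian_kde(np.hstack([expert_states, expert_actions]))

# Action the expert would take in state 0.5, versus an action far from the demonstrations.
reward_on_expert = density_reward(expert_log_density, 0.5, 1.0)
reward_off_expert = density_reward(expert_log_density, 0.5, -1.0)
```

Under these assumptions, the pair that lies on the expert's behavior receives a higher reward than the pair far from it, so no separate adversarial reward model has to be trained. Any standard reinforcement learning algorithm could then be run against such a reward; that step is omitted here.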
Taxonomy
Topics: Reinforcement Learning in Robotics
