Weighted Maximum Entropy Inverse Reinforcement Learning
The Viet Bui, Tien Mai, Patrick Jaillet

TL;DR
This paper introduces a weighted maximum entropy approach to inverse reinforcement learning that captures expert stochasticity, improving reward and policy recovery from demonstrations in both discrete and continuous tasks.
Contribution
It proposes a weighted maximum entropy framework that learns both the reward function and the structure of the entropy terms, improving performance in inverse reinforcement learning (IRL) and imitation learning (IM).
Findings
Outperforms prior algorithms on human and simulated demonstrations
Effective in discrete and continuous IRL/IM tasks
Captures expert stochasticity and bounded rationality
Abstract
We study inverse reinforcement learning (IRL) and imitation learning (IM), the problems of recovering a reward or policy function from an expert's demonstrated trajectories. We propose a new way to improve the learning process by adding a weight function to the maximum entropy framework, motivated by the goal of learning and recovering the stochasticity (or the bounded rationality) of the expert policy. Our framework and algorithms make it possible to learn both a reward (or policy) function and the structure of the entropy terms added to the Markov decision process, thus enhancing the learning procedure. Our numerical experiments, using human and simulated demonstrations on discrete and continuous IRL/IM tasks, show that our approach outperforms prior algorithms.
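To make the idea of a weight on the entropy term concrete, here is a minimal sketch of the forward (planning) step only, not the authors' learning algorithm. It assumes the weight is a positive per-state value mu(s) multiplying the policy entropy, which gives a softmax policy with a state-dependent temperature. The abstract does not specify the paper's parameterization, so the function name, the array layout, and the toy MDP are all hypothetical.

```python
import numpy as np

def weighted_soft_value_iteration(P, r, mu, gamma=0.9, iters=500):
    """Soft value iteration with an assumed per-state entropy weight mu(s) > 0.

    P:  (S, A, S) transition probabilities
    r:  (S, A) rewards
    mu: (S,) entropy weights (assumption: state-dependent only)
    Returns V with shape (S,) and the policy pi with shape (S, A).
    """
    V = np.zeros(r.shape[0])
    for _ in range(iters):
        Q = r + gamma * (P @ V)                    # (S, A) soft Q-values
        Z = Q / mu[:, None]                        # scale by per-state weight
        Zmax = Z.max(axis=1, keepdims=True)        # for a stable log-sum-exp
        V = mu * (Zmax[:, 0] + np.log(np.exp(Z - Zmax).sum(axis=1)))
    # Policy: pi(a|s) proportional to exp(Q(s, a) / mu(s))
    Q = r + gamma * (P @ V)
    Z = Q / mu[:, None]
    Z = Z - Z.max(axis=1, keepdims=True)
    pi = np.exp(Z)
    pi = pi / pi.sum(axis=1, keepdims=True)
    return V, pi

# Toy 2-state, 2-action MDP (hypothetical): action 0 stays, action 1 switches.
P = np.zeros((2, 2, 2))
P[0, 0, 0] = 1.0
P[0, 1, 1] = 1.0
P[1, 0, 1] = 1.0
P[1, 1, 0] = 1.0
r = np.array([[1.0, 0.0],
              [1.0, 0.0]])
mu = np.array([0.1, 5.0])   # small weight in state 0, large weight in state 1

V, pi = weighted_soft_value_iteration(P, r, mu)
print(pi)
```

Under this assumption, a small mu(s) yields a nearly deterministic choice in that state, while a large mu(s) yields a more stochastic one. An IRL method in this spirit would fit both the reward and the weights to the demonstrations, so that states where the expert acts less consistently receive larger weights.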
Taxonomy
Topics: Reinforcement Learning in Robotics · Neural dynamics and brain function
