Weighted Maximum Entropy Inverse Reinforcement Learning

The Viet Bui; Tien Mai; Patrick Jaillet

arXiv:2208.09611·cs.LG·August 23, 2022

TL;DR

This paper introduces a weighted maximum entropy approach to inverse reinforcement learning that captures expert stochasticity, improving reward and policy recovery from demonstrations in various tasks.

Contribution

Proposes a weighted maximum entropy framework that learns both the reward function and the structure of the entropy terms, improving performance on inverse reinforcement learning (IRL) and imitation learning (IM) tasks.

Findings

01 Outperforms prior algorithms on human and simulated demonstrations

02 Effective on both discrete and continuous IRL/IM tasks

03 Captures expert stochasticity and bounded rationality

Abstract

We study inverse reinforcement learning (IRL) and imitation learning (IM), the problems of recovering a reward or policy function from an expert's demonstrated trajectories. We propose a new way to improve the learning process by adding a weight function to the maximum entropy framework, motivated by the goal of learning and recovering the stochasticity (or the bounded rationality) of the expert policy. Our framework and algorithms make it possible to learn both a reward (or policy) function and the structure of the entropy terms added to the Markov Decision Processes, thus enhancing the learning procedure. Our numerical experiments, using human and simulated demonstrations on discrete and continuous IRL/IM tasks, show that our approach outperforms prior algorithms.
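To make the idea of a weighted entropy term concrete, here is a minimal sketch of the forward problem only. It assumes the weight is a positive, state-dependent factor mu(s) that multiplies the policy entropy in the MDP objective, and it uses standard soft value iteration with that factor acting as a per-state temperature. The names `weighted_soft_value_iteration`, `P`, `r`, and `mu` are hypothetical, and the toy MDP is made up for illustration. This is not the authors' learning algorithm, which also learns the weights from demonstrations.

```python
import numpy as np

def weighted_soft_value_iteration(P, r, mu, gamma=0.9, iters=500):
    """Soft value iteration with a state-dependent entropy weight (assumed form).

    P:  (S, A, S) transition probabilities
    r:  (S, A) rewards
    mu: (S,) positive entropy weights, one per state (assumption)
    """
    V = np.zeros(r.shape[0])
    for _ in range(iters):
        Q = r + gamma * (P @ V)                    # (S, A) action values
        z = Q / mu[:, None]                        # scale by per-state weight
        zmax = z.max(axis=1, keepdims=True)        # stabilise the log-sum-exp
        V = mu * (zmax[:, 0] + np.log(np.exp(z - zmax).sum(axis=1)))
    Q = r + gamma * (P @ V)
    logits = Q / mu[:, None]
    logits = logits - logits.max(axis=1, keepdims=True)
    pi = np.exp(logits)
    pi = pi / pi.sum(axis=1, keepdims=True)        # softmax policy per state
    return V, pi

# Toy MDP (made up): 2 states, 2 actions, uniform transitions,
# identical rewards in both states, different entropy weights.
P = np.full((2, 2, 2), 0.5)
r = np.array([[1.0, 0.0], [1.0, 0.0]])
mu = np.array([0.1, 5.0])
V, pi = weighted_soft_value_iteration(P, r, mu)
print(pi)
```

In this toy setting the two states differ only in their weight: the state with the small weight yields a nearly deterministic policy, while the state with the large weight yields a nearly uniform one. That is the kind of state-dependent stochasticity a single global temperature cannot express.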

Taxonomy

Topics: Reinforcement Learning in Robotics · Neural dynamics and brain function