Task-Guided Inverse Reinforcement Learning Under Partial Information

Franck Djeumou; Murat Cubuktepe; Craig Lennon; Ufuk Topcu

arXiv:2105.14073 · cs.LG · December 20, 2021

Open Access

TL;DR

This paper presents a novel IRL algorithm for POMDPs that leverages causal entropy and temporal logic specifications to recover reward functions and policies from limited data, addressing information asymmetry and scalability.

Contribution

It introduces a scalable IRL method for POMDPs that incorporates task specifications and memory, overcoming limitations of existing techniques under partial observability.

Findings

01. Effective reward and policy recovery with limited data.

02. Incorporation of temporal logic reduces information asymmetry.

03. Memory-enhanced policies outperform memoryless ones.

Abstract

We study the problem of inverse reinforcement learning (IRL), where the learning agent recovers a reward function using expert demonstrations. Most of the existing IRL techniques make the often unrealistic assumption that the agent has access to full information about the environment. We remove this assumption by developing an algorithm for IRL in partially observable Markov decision processes (POMDPs). The algorithm addresses several limitations of existing techniques that do not take the information asymmetry between the expert and the learner into account. First, it adopts causal entropy as the measure of the likelihood of the expert demonstrations as opposed to entropy in most existing IRL techniques, and avoids a common source of algorithmic complexity. Second, it incorporates task specifications expressed in temporal logic into IRL. Such specifications may be interpreted as side…
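To make the causal-entropy objective mentioned in the abstract more concrete, here is a minimal sketch of how a discounted causal entropy could be evaluated for a fixed policy. It is not the paper's algorithm: it assumes a small, fully observable MDP with a memoryless stationary policy, whereas the paper works with POMDPs and memory-based policies. The function name `discounted_causal_entropy`, the array layout, and the toy two-state model are all hypothetical choices made for illustration.

```python
import numpy as np

def discounted_causal_entropy(P, policy, mu0, gamma):
    """Discounted causal entropy sum_t gamma^t E[-log pi(a_t | s_t)].

    Assumed (hypothetical) layout:
      P      : (S, A, S) array, P[s, a, t] = Pr(next state t | state s, action a)
      policy : (S, A) array, policy[s, a] = pi(a | s)
      mu0    : (S,) initial state distribution
      gamma  : discount factor in [0, 1)
    """
    n_states = P.shape[0]
    # Per-state entropy of the action distribution; 0 * log(0) is treated as 0.
    safe = np.where(policy > 0, policy, 1.0)
    h = -np.sum(policy * np.log(safe), axis=1)
    # State-to-state transition matrix induced by the policy.
    P_pi = np.einsum("sa,sat->st", policy, P)
    # Solve the linear recursion V = h + gamma * P_pi @ V.
    V = np.linalg.solve(np.eye(n_states) - gamma * P_pi, h)
    return float(mu0 @ V)

# Toy model (assumption): action 0 always leads to state 0, action 1 to state 1.
P = np.zeros((2, 2, 2))
P[:, 0, 0] = 1.0
P[:, 1, 1] = 1.0
mu0 = np.array([1.0, 0.0])
gamma = 0.9

uniform = np.full((2, 2), 0.5)                      # maximally random policy
deterministic = np.array([[1.0, 0.0], [0.0, 1.0]])  # no randomness at all

H_uniform = discounted_causal_entropy(P, uniform, mu0, gamma)
H_deterministic = discounted_causal_entropy(P, deterministic, mu0, gamma)
```

In this toy model the uniform policy has entropy log 2 at every step, so its discounted causal entropy is log(2) / (1 - gamma), while the deterministic policy scores zero. The quantity conditions each action only on what the policy has seen so far, which is what distinguishes causal entropy from the entropy of whole trajectories.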

Taxonomy

Topics: Reinforcement Learning in Robotics · Neural dynamics and brain function