CWAE-IRL: Formulating a supervised approach to Inverse Reinforcement Learning problem

Arpan Kusari

arXiv:1910.00584·cs.LG·October 3, 2019

TL;DR

This paper introduces CWAE-IRL, a novel supervised method using variational inference and Wasserstein auto-encoders to infer reward functions in IRL without requiring system dynamics knowledge, demonstrated on benchmark tasks.

Contribution

It proposes a new IRL approach that combines a CVAE with a Wasserstein loss, enabling reward inference in complex environments without knowledge of the system dynamics.

Findings

01. Effectively learns latent reward functions in high-dimensional environments
02. Outperforms previous IRL methods on objectworld and pendulum benchmarks
03. Does not require prior knowledge of system dynamics

Abstract

Inverse reinforcement learning (IRL) is used to infer the reward function from the actions of an expert running a Markov Decision Process (MDP). This research proposes a novel approach that uses variational inference to learn the reward function. With this technique, the intractable posterior distribution of the continuous latent variable (here, the reward function) is approximated analytically so that it stays as close as possible to the prior belief while reconstructing the future state conditioned on the current state and action. The reward function is derived using a well-known deep generative model, the Conditional Variational Auto-encoder (CVAE), with a Wasserstein loss function. The resulting method is referred to as Conditional Wasserstein Auto-encoder-IRL (CWAE-IRL) and can be analyzed as a combination of backward and forward inference. This can then form an efficient alternative…
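As a rough illustration of the objective the abstract describes, the sketch below encodes a transition into a latent reward, decodes the next state from the current state, action, and that reward, and penalizes the mismatch between the encoded rewards and a prior. Everything specific here is an assumption for illustration and is not taken from the paper: the linear encoder and decoder, the parameter names `W_enc` and `W_dec`, the standard normal prior, the weight `lam`, and the use of an RBF-kernel maximum mean discrepancy (MMD) penalty, which is one common way to train Wasserstein auto-encoders.

```python
import numpy as np

def rbf_mmd2(x, y, bandwidth=1.0):
    """Biased estimate of the squared MMD between sample sets x and y (RBF kernel)."""
    def kernel(a, b):
        # Pairwise squared distances via broadcasting: shape (len(a), len(b))
        sq = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-sq / (2.0 * bandwidth ** 2))
    return kernel(x, x).mean() + kernel(y, y).mean() - 2.0 * kernel(x, y).mean()

def cwae_loss(params, s, a, s_next, rng, lam=10.0):
    """Reconstruction error plus a prior-matching penalty (hypothetical sketch)."""
    # Encoder (assumed linear): (s, a, s') -> latent reward r
    r = np.concatenate([s, a, s_next], axis=1) @ params["W_enc"]
    # Decoder (assumed linear): (s, a, r) -> predicted next state
    s_pred = np.concatenate([s, a, r], axis=1) @ params["W_dec"]
    recon = np.mean((s_pred - s_next) ** 2)
    # Samples from the assumed standard normal prior over the reward
    prior = rng.standard_normal(r.shape)
    return recon + lam * rbf_mmd2(r, prior), r

# Usage on random placeholder transitions (3-D state, 1-D action, 1-D reward)
rng = np.random.default_rng(0)
s = rng.standard_normal((32, 3))
a = rng.standard_normal((32, 1))
s_next = rng.standard_normal((32, 3))
params = {
    "W_enc": 0.1 * rng.standard_normal((7, 1)),
    "W_dec": 0.1 * rng.standard_normal((5, 3)),
}
loss, reward = cwae_loss(params, s, a, s_next, rng)
```

In this sketch the decoder only ever sees the current state, the action, and the latent reward, so no transition model is supplied by hand; in practice the linear maps would be replaced by neural networks and the loss minimized with a gradient-based optimizer.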

Taxonomy

Topics: Reinforcement Learning in Robotics · Advanced Control Systems Optimization · Adaptive Dynamic Programming Control