CWAE-IRL: Formulating a supervised approach to Inverse Reinforcement Learning problem
Arpan Kusari

TL;DR
This paper introduces CWAE-IRL, a supervised method that uses variational inference and a Wasserstein auto-encoder to infer reward functions in IRL without requiring knowledge of the system dynamics, demonstrated on benchmark tasks.
Contribution
It proposes a new IRL approach that combines a CVAE with a Wasserstein loss, enabling reward inference in complex environments without knowledge of the system dynamics.
Findings
Effectively learns latent reward functions in high-dimensional environments
Outperforms previous IRL methods on objectworld and pendulum benchmarks
Does not require prior knowledge of system dynamics
Abstract
Inverse reinforcement learning (IRL) is used to infer the reward function from the actions of an expert acting in a Markov Decision Process (MDP). This research proposes a novel approach that uses variational inference to learn the reward function. With this technique, the intractable posterior distribution of the continuous latent variable (here, the reward function) is analytically approximated so that it stays as close as possible to the prior belief while reconstructing the future state conditioned on the current state and action. The reward function is derived using a well-known deep generative model, the Conditional Variational Auto-encoder (CVAE), with a Wasserstein loss function. The method is therefore referred to as Conditional Wasserstein Auto-encoder-IRL (CWAE-IRL), and it can be analyzed as a combination of backward and forward inference. This can then form an efficient alternative…
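To make the objective described in the abstract concrete, here is a minimal NumPy sketch of one possible loss of this shape. It is not the paper's implementation. The following are assumptions made for illustration only: the encoder sees (state, action, next state), the latent reward is a scalar, the prior is a standard normal, the divergence term is an RBF-kernel MMD penalty (one common choice in Wasserstein auto-encoders), and the names rbf_mmd2, cwae_loss, encode, decode and lam are hypothetical.

```python
import numpy as np

def rbf_mmd2(x, y, bandwidth=1.0):
    """Biased estimate of the squared MMD between two sample sets (RBF kernel)."""
    def kernel(p, q):
        sq_dists = np.sum((p[:, None, :] - q[None, :, :]) ** 2, axis=-1)
        return np.exp(-sq_dists / (2.0 * bandwidth ** 2))
    return kernel(x, x).mean() + kernel(y, y).mean() - 2.0 * kernel(x, y).mean()

def cwae_loss(encode, decode, s, a, s_next, lam=1.0, rng=None):
    """Reconstruction error plus a penalty pulling latent rewards toward the prior.

    Assumption: the latent variable r plays the role of the reward, and the
    prior over r is a standard normal.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    r = encode(s, a, s_next)                 # backward inference: latent reward
    s_pred = decode(s, a, r)                 # forward inference: next state
    recon = np.mean(np.sum((s_pred - s_next) ** 2, axis=-1))
    prior_samples = rng.standard_normal(r.shape)
    return recon + lam * rbf_mmd2(r, prior_samples)

# Toy linear encoder and decoder (hypothetical stand-ins for neural networks).
rng = np.random.default_rng(0)
state_dim, action_dim, batch = 3, 1, 32
W_enc = 0.1 * rng.standard_normal((2 * state_dim + action_dim, 1))
W_dec = 0.1 * rng.standard_normal((state_dim + action_dim + 1, state_dim))

def encode(s, a, s_next):
    return np.concatenate([s, a, s_next], axis=1) @ W_enc

def decode(s, a, r):
    return np.concatenate([s, a, r], axis=1) @ W_dec

# Synthetic expert transitions, made up purely to exercise the loss.
s = rng.standard_normal((batch, state_dim))
a = rng.standard_normal((batch, action_dim))
s_next = s + 0.1 * a
loss = cwae_loss(encode, decode, s, a, s_next, lam=10.0, rng=rng)
```

In a real setting the two linear maps would be replaced by trainable networks and the loss minimized with a gradient-based optimizer; the weight lam trades off reconstruction accuracy against staying close to the prior.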
Taxonomy
Topics: Reinforcement Learning in Robotics · Advanced Control Systems Optimization · Adaptive Dynamic Programming Control
