Bayesian learning of noisy Markov decision processes

Sumeetpal S. Singh; Nicolas Chopin; Nick Whiteley

arXiv:1211.5901 · stat.ML · November 27, 2012

Open Access

TL;DR

This paper proposes a Bayesian framework for inverse reinforcement learning, built on a statistical model derived from the structure of a Markov decision process. A new MCMC sampler with a parameter expansion step is used to learn a controller from state/action data and to predict its actions.

Contribution

The paper presents a Bayesian statistical model for inverse reinforcement learning in noisy MDPs, together with an MCMC sampler whose parameter expansion step is shown to be essential for good convergence.

Findings

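1. Latent variables are estimated and actions are predicted within a single Bayesian framework for inverse RL in noisy MDPs.

2. A new MCMC sampler is devised, with a parameter expansion step shown to be essential for good convergence.

3. The method is illustrated by learning a human controller.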

Abstract

We consider the inverse reinforcement learning problem, that is, the problem of learning from, and then predicting or mimicking, a controller based on state/action data. We propose a statistical model for such data, derived from the structure of a Markov decision process. Adopting a Bayesian approach to inference, we show how latent variables of the model can be estimated, and how predictions about actions can be made, in a unified framework. A new Markov chain Monte Carlo (MCMC) sampler is devised for simulation from the posterior distribution. The sampler includes a parameter expansion step, which is shown to be essential for good convergence properties. As an illustration, the method is applied to learning a human controller.

Taxonomy

Topics: Gaussian Processes and Bayesian Inference · Control Systems and Identification · Reinforcement Learning in Robotics