Bayesian learning of noisy Markov decision processes
Sumeetpal S. Singh, Nicolas Chopin, Nick Whiteley

TL;DR
This paper introduces a Bayesian framework for inverse reinforcement learning in noisy Markov decision processes. A new MCMC sampler with a parameter expansion step is used to learn from state/action data and to predict a controller's actions.
Contribution
It presents a novel Bayesian model for inverse reinforcement learning in noisy MDPs and develops an efficient MCMC sampling method with parameter expansion.
Findings
Latent variables are estimated and actions predicted within a single Bayesian framework
New MCMC sampler whose parameter expansion step is shown to be essential for good convergence
Successful application to learning a human controller
Abstract
We consider the inverse reinforcement learning problem, that is, the problem of learning from, and then predicting or mimicking, a controller based on state/action data. We propose a statistical model for such data, derived from the structure of a Markov decision process. Adopting a Bayesian approach to inference, we show how latent variables of the model can be estimated, and how predictions about actions can be made, in a unified framework. A new Markov chain Monte Carlo (MCMC) sampler is devised for simulation from the posterior distribution. This sampler includes a parameter expansion step, which is shown to be essential for good convergence properties of the MCMC sampler. As an illustration, the method is applied to learning a human controller.
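The abstract does not spell out the model or the sampler, so the following is only a rough illustration of the general idea of Bayesian inference from state/action data, not the authors' method. It assumes a hypothetical toy setting: three states, two actions, a latent per-state "advantage" `d[s]` of action 1 over action 0, and a controller that picks action 1 when `d[s]` plus standard Gaussian noise is positive. A plain random-walk Metropolis sampler (with no parameter expansion step) is swapped in for the paper's sampler. All names (`log_posterior`, `metropolis`, `predict_action_prob`, `tau`, `step`) are made up for this sketch.

```python
import math
import numpy as np

def norm_cdf(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def log_phi(x):
    """Log of the standard normal CDF, clipped to avoid log(0)."""
    return math.log(min(max(norm_cdf(x), 1e-12), 1.0))

def log_posterior(d, n1, n0, tau=2.0):
    """Log posterior (up to a constant) of the latent advantages d.

    Assumed toy model: P(action = 1 | state s) = Phi(d[s]),
    with an independent N(0, tau^2) prior on each d[s].
    n1[s] and n0[s] count how often action 1 / action 0 was seen in state s.
    """
    lp = -0.5 * float(np.sum(d ** 2)) / tau ** 2
    for s in range(len(d)):
        lp += n1[s] * log_phi(d[s]) + n0[s] * log_phi(-d[s])
    return lp

def metropolis(n1, n0, n_iter=4000, step=0.3, seed=0):
    """Random-walk Metropolis over d (a generic sampler, not the paper's)."""
    rng = np.random.default_rng(seed)
    d = np.zeros(len(n1))
    lp = log_posterior(d, n1, n0)
    samples = np.empty((n_iter, len(n1)))
    for i in range(n_iter):
        prop = d + step * rng.standard_normal(len(n1))
        lp_prop = log_posterior(prop, n1, n0)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, lp_prop - lp)):
            d, lp = prop, lp_prop
        samples[i] = d
    return samples

def predict_action_prob(samples, s):
    """Posterior predictive probability of action 1 in state s."""
    return float(np.mean([norm_cdf(x) for x in samples[:, s]]))

# Simulate state/action data from a noisy controller (all values hypothetical).
rng = np.random.default_rng(1)
true_d = np.array([1.5, -1.5, 0.0])
states = rng.integers(0, 3, size=300)
actions = (true_d[states] + rng.standard_normal(300) > 0).astype(int)

# Sufficient statistics: per-state counts of each action.
n1 = np.bincount(states, weights=actions, minlength=3)
n0 = np.bincount(states, minlength=3) - n1

samples = metropolis(n1, n0)[1000:]  # discard burn-in
post_mean = samples.mean(axis=0)
pred = [predict_action_prob(samples, s) for s in range(3)]
```

Here `post_mean` estimates the latent advantages and `pred` gives predicted action probabilities per state, mirroring in miniature the "estimate latent variables, then predict actions" workflow the abstract describes.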
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Control Systems and Identification · Reinforcement Learning in Robotics
