Bayesian Policy Search for Stochastic Domains

David Tolpin; Yuan Zhou; Hongseok Yang

arXiv:2010.00284 · cs.LG · October 2, 2020

TL;DR

This paper casts policy search in stochastic domains as a Bayesian inference problem, encodes such problems as nested probabilistic programs, and adapts Lightweight Metropolis-Hastings (LMH) for robust inference in those programs. Policies of similar quality are learned with the proposed scheme.

Contribution

The paper presents a novel Bayesian formulation of policy search in stochastic domains that relies on nested conditioning, and adapts LMH for inference in the resulting nested probabilistic programs.

Findings

01. Policies of similar quality are learned with the new scheme.

02. The adapted LMH is simpler and more general.

03. The approach is applicable to a wider class of probabilistic programs.

Abstract

AI planning can be cast as inference in probabilistic models, and probabilistic programming has been shown to be capable of policy search in partially observable domains. Prior work introduced policy search through Markov chain Monte Carlo in deterministic domains and adapted black-box variational inference to stochastic domains, though not in a strictly Bayesian sense. In this work, we cast policy search in stochastic domains as a Bayesian inference problem and provide a scheme for encoding such problems as nested probabilistic programs. We argue that probabilistic programs for policy search in stochastic domains should involve nested conditioning, and provide an adaptation of Lightweight Metropolis-Hastings (LMH) for robust inference in such programs. We apply the proposed scheme to stochastic domains and show that policies of similar quality are learned, despite a simpler and more…
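For readers unfamiliar with the idea of policy search as sampling, the following is a minimal sketch of generic single-site Metropolis-Hastings over a trace of named random choices, in the spirit of lightweight MH. It is not the adapted LMH from the paper and does not model nested conditioning. The two-parameter policy, the `log_weight` function standing in for a reward-based log-likelihood, and all names are hypothetical and chosen only for illustration; the toy reward is deterministic. Because each proposal redraws one site from its prior and the trace structure is fixed, the acceptance ratio reduces to the likelihood ratio.

```python
import math
import random


def sample_prior(rng):
    # Hypothetical policy with two parameters, each with a Uniform(0, 1) prior.
    return {"a": rng.random(), "b": rng.random()}


def log_weight(trace):
    # Toy stand-in for a reward-based log-likelihood (an assumption, not from
    # the paper): peaked at a = 0.7, b = 0.2 with standard deviation 0.05.
    return -((trace["a"] - 0.7) ** 2 + (trace["b"] - 0.2) ** 2) / (2 * 0.05 ** 2)


def single_site_mh(n_steps, seed=0):
    rng = random.Random(seed)
    trace = sample_prior(rng)
    lw = log_weight(trace)
    samples = []
    for _ in range(n_steps):
        # Pick one random choice in the trace and redraw it from its prior.
        site = rng.choice(sorted(trace))
        proposal = dict(trace)
        proposal[site] = rng.random()
        lw_new = log_weight(proposal)
        # Prior and proposal densities cancel, so the acceptance probability
        # is min(1, exp(lw_new - lw)).
        if rng.random() < math.exp(min(0.0, lw_new - lw)):
            trace, lw = proposal, lw_new
        samples.append(trace)
    return samples


if __name__ == "__main__":
    kept = single_site_mh(20000)[2000:]  # discard an initial burn-in
    mean_a = sum(s["a"] for s in kept) / len(kept)
    mean_b = sum(s["b"] for s in kept) / len(kept)
    print(f"posterior mean of a: {mean_a:.3f}, of b: {mean_b:.3f}")
```

In this toy setting the sampled parameters concentrate around the values that maximize the stand-in reward, which is the sense in which sampling from a posterior can serve as policy search.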

Taxonomy

Topics: Machine Learning and Algorithms · Gaussian Processes and Bayesian Inference · Reinforcement Learning in Robotics