Bayes-CPACE: PAC Optimal Exploration in Continuous Space Bayes-Adaptive Markov Decision Processes

Gilwoo Lee; Sanjiban Choudhury; Brian Hou; Siddhartha S. Srinivasa

arXiv:1810.03048 · cs.LG · October 9, 2018

TL;DR

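This paper introduces Bayes-CPACE, which the authors describe as the first PAC optimal algorithm for continuous-space BAMDPs. It covers the state-belief-action space with a finite set of samples and exploits the Lipschitz continuity of the value function to compute a near-optimal policy under model uncertainty.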

Contribution

The paper presents a PAC optimal algorithm for continuous BAMDPs that works around the intractability of exact Bayesian planning by combining a finite sample cover with the Lipschitz continuity of the value function.

Findings

01 The algorithm is proven to be near-optimal.

02 Empirical results on discrete and continuous BAMDPs show consistently competitive performance against baseline approaches.

03 Several schemes are analyzed that boost the algorithm's efficiency.

Abstract

We present the first PAC optimal algorithm for Bayes-Adaptive Markov Decision Processes (BAMDPs) in continuous state and action spaces, to the best of our knowledge. The BAMDP framework elegantly addresses model uncertainty by incorporating Bayesian belief updates into long-term expected return. However, computing an exact optimal Bayesian policy is intractable. Our key insight is to compute a near-optimal value function by covering the continuous state-belief-action space with a finite set of representative samples and exploiting the Lipschitz continuity of the value function. We prove the near-optimality of our algorithm and analyze a number of schemes that boost the algorithm's efficiency. Finally, we empirically validate our approach on a number of discrete and continuous BAMDPs and show that the learned policy has consistently competitive performance against baseline approaches.
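The key insight in the abstract, bounding the value at an unseen point from a finite set of samples via Lipschitz continuity, can be illustrated with a minimal sketch. This is not the paper's estimator: the function name `lipschitz_upper_bound`, the Euclidean metric, the flat tuple encoding of a (state, belief, action) point, and the cap `q_max` are all assumptions made for illustration. The sketch assumes the sampled values are themselves valid upper bounds on a value function with Lipschitz constant `lipschitz_const`.

```python
import math

def lipschitz_upper_bound(query, samples, lipschitz_const, q_max):
    """Optimistic value estimate at `query` from a finite sample set.

    query: tuple of floats, a hypothetical flat encoding of a
        (state, belief, action) point.
    samples: list of (point, q_value) pairs in the same encoding.
    lipschitz_const: assumed Lipschitz constant of the value function.
    q_max: assumed upper limit on any value; used when no sample
        gives a tighter bound.
    """
    best = q_max  # with no informative sample, stay fully optimistic
    for point, q_value in samples:
        # If Q is L-Lipschitz, then Q(query) <= Q(point) + L * distance.
        bound = q_value + lipschitz_const * math.dist(query, point)
        best = min(best, bound)
    return best

# Two sampled points on a line with assumed values 1.0 and 3.0.
samples = [((0.0, 0.0), 1.0), ((1.0, 0.0), 3.0)]
midpoint_estimate = lipschitz_upper_bound((0.5, 0.0), samples, 2.0, 10.0)
far_estimate = lipschitz_upper_bound((100.0, 0.0), samples, 2.0, 10.0)
```

In this sketch, points close to a sample inherit a tight bound, while points far from every sample fall back to the optimistic cap, which is the kind of optimism that encourages visiting poorly covered regions.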

Taxonomy

Topics: Reinforcement Learning in Robotics · Machine Learning and Algorithms · AI-based Problem Solving and Planning