Estimation and Control Using Sampling-Based Bayesian Reinforcement Learning

Patrick Slade; Zachary N. Sunberg; Mykel J. Kochenderfer

arXiv:1808.00888·cs.SY·August 3, 2018

TL;DR

This paper presents a sampling-based Bayesian reinforcement learning approach for estimation and control in uncertain nonlinear systems, balancing exploration and exploitation to improve robustness and performance.

Contribution

It introduces an online Monte Carlo tree search method combined with an unscented Kalman filter for real-time decision-making under uncertainty in nonlinear systems.
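
As a rough illustration of the estimation side, the sketch below propagates a Gaussian belief over a joint state-and-parameter vector through nonlinear dynamics using the basic unscented transform. It is not the paper's implementation: the function names, the toy dynamics in `toy_step`, the value of `kappa`, and the noise values are all assumptions chosen for illustration, and only the prediction step is shown (no measurement update and no tree search).

```python
import numpy as np


def sigma_points(mean, cov, kappa=1.0):
    """Return the 2n+1 sigma points and weights of the basic unscented transform."""
    n = mean.size
    # Lower Cholesky factor L with L @ L.T == (n + kappa) * cov.
    root = np.linalg.cholesky((n + kappa) * cov)
    # Rows of root.T are the columns of L, so each row is one spread direction.
    points = np.vstack([mean, mean + root.T, mean - root.T])
    weights = np.full(2 * n + 1, 1.0 / (2.0 * (n + kappa)))
    weights[0] = kappa / (n + kappa)
    return points, weights


def predict(mean, cov, u, step, process_cov):
    """Propagate a Gaussian belief over [state, parameter] through `step`."""
    points, weights = sigma_points(mean, cov)
    propagated = np.array([step(p, u) for p in points])
    new_mean = weights @ propagated
    centered = propagated - new_mean
    # Weighted sample covariance of the propagated points plus additive process noise.
    new_cov = centered.T @ (weights[:, None] * centered) + process_cov
    return new_mean, new_cov


def toy_step(z, u, dt=0.1):
    """Hypothetical dynamics: x' = x + dt * (-theta * x + u); theta stays constant."""
    x, theta = z
    return np.array([x + dt * (-theta * x + u), theta])


if __name__ == "__main__":
    belief_mean = np.array([1.0, 0.5])   # [x, theta]
    belief_cov = np.diag([0.01, 0.25])   # confident in x, unsure about theta
    noise_cov = np.diag([1e-4, 0.0])     # process noise acts on x only
    mean, cov = predict(belief_mean, belief_cov, u=0.2, step=toy_step,
                        process_cov=noise_cov)
    print(mean)
    print(cov)
```

Stacking the unknown parameter into the filtered vector is one common way to let a single filter track both the state and the model uncertainty; a planner could then call a belief update like `predict` on simulated transitions, though how the paper wires this into its tree search is not specified here.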

Findings

01 Outperforms certainty equivalent model predictive control in simulations

02 Provides insights into when information gathering improves control performance

03 Uses offline optimization to tune Monte Carlo parameters effectively

Abstract

Real-world autonomous systems operate under uncertainty about both their pose and dynamics. Autonomous control systems must simultaneously perform estimation and control tasks to maintain robustness to changing dynamics or modeling errors. However, information-gathering actions often conflict with optimal actions for reaching control objectives, requiring a trade-off between exploration and exploitation. The specific problem setting considered here is discrete-time nonlinear systems with process noise, input constraints, and parameter uncertainty. This article frames this problem as a Bayes-adaptive Markov decision process and solves it online using Monte Carlo tree search with an unscented Kalman filter to account for process noise and parameter uncertainty. This method is compared with certainty equivalent model predictive control and a tree search method that approximates the…
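
The problem setting named in the abstract can be written compactly in one common form. The notation below is an assumption for illustration and is not taken from the paper: $f$ is the nonlinear dynamics, $\theta$ the unknown parameters, $w_t$ additive process noise (assumed Gaussian here), $\mathcal{U}$ the constrained input set, and $b_t$ the belief over $\theta$ that, together with $x_t$, forms the hyper-state of the Bayes-adaptive Markov decision process.

```latex
\begin{aligned}
x_{t+1} &= f(x_t, u_t, \theta) + w_t, \qquad w_t \sim \mathcal{N}(0, Q), \qquad u_t \in \mathcal{U}, \\
b_t(\theta) &= p\left(\theta \mid x_{0:t},\, u_{0:t-1}\right), \qquad s_t = (x_t, b_t).
\end{aligned}
```

In this framing, an action affects both the next state $x_{t+1}$ and the next belief $b_{t+1}$, which is where the exploration versus exploitation trade-off enters.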
