Simultaneous active parameter estimation and control using sampling-based Bayesian reinforcement learning

Patrick Slade, Preston Culbertson, Zachary Sunberg, and Mykel Kochenderfer

arXiv:1707.09055 · cs.SY · July 31, 2017

TL;DR

This paper combines Bayesian reinforcement learning with Monte Carlo tree search (MCTS) and an extended Kalman filter so that a robot can estimate uncertain model parameters while it controls the system.

Contribution

The paper frames simultaneous estimation and control as a Bayes-adaptive Markov decision process (MDP) and solves it online with MCTS and an extended Kalman filter, with the aim of staying robust to modeling error and payload changes in manipulation tasks.
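
As a point of reference, the generic Bayes-adaptive formulation can be written as below. The notation ($s$ for the physical state, $b$ for the belief over unknown parameters $\theta$, $\tau$ for the Bayesian belief update, $R$ for the reward, $\gamma$ for the discount factor) is assumed here for illustration and is not taken from the paper.

```latex
% Hyperstate: physical state s paired with a belief b over the unknown parameters \theta.
\begin{align}
  s' &\sim T(\cdot \mid s, a, \theta), \qquad \theta \sim b, \\
  b' &= \tau(b, a, s'), \\
  V^*(s, b) &= \max_{a} \; \mathbb{E}_{\theta \sim b,\; s' \sim T(\cdot \mid s, a, \theta)}
      \left[ R(s, a) + \gamma \, V^*\big(s', \tau(b, a, s')\big) \right].
\end{align}
```

Because the belief is part of the state, the value of an action includes the information it yields about $\theta$, which is how the exploration/exploitation trade-off enters the objective.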

Findings

1. MCTS effectively reduces model uncertainty during control.

2. The approach outperforms certainty-equivalent MPC in simulations.

3. The method handles Gaussian noise and parameter uncertainty robustly.
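
To make the certainty-equivalent benchmark concrete, here is a minimal sketch of such a planner on a toy one-dimensional system. Everything in it is a hypothetical illustration: the velocity/mass model, the discrete action set, the quadratic cost, and the name `ce_mpc_action` are assumptions, not the paper's setup. The planner plugs the mean parameter estimate into the model as if it were exact and ignores its variance.

```python
from itertools import product


def ce_mpc_action(v, mass_mean, v_goal, actions=(-1.0, 0.0, 1.0), horizon=3, dt=0.1):
    """Return the first action of the best open-loop sequence under the mean model.

    Hypothetical toy model: v_next = v + (u / mass) * dt.
    Certainty equivalence: only the mean mass estimate is used; its
    uncertainty plays no role in the plan.
    """
    best_cost, best_first = float("inf"), actions[0]
    for seq in product(actions, repeat=horizon):
        v_sim, cost = v, 0.0
        for u in seq:
            v_sim = v_sim + (u / mass_mean) * dt  # roll out with the point estimate
            cost += (v_sim - v_goal) ** 2         # quadratic tracking cost (assumed)
        if cost < best_cost:
            best_cost, best_first = cost, seq[0]
    return best_first


# A receding-horizon controller would replan every step and apply only this action.
first_action = ce_mpc_action(v=0.0, mass_mean=1.0, v_goal=1.0)
```

Since the plan depends only on `mass_mean`, two beliefs with the same mean but very different variances produce the same action. A belief-aware planner can instead choose actions partly for the information they provide.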

Abstract

Robots performing manipulation tasks must operate under uncertainty about both their pose and the dynamics of the system. In order to remain robust to modeling error and shifts in payload dynamics, agents must simultaneously perform estimation and control tasks. However, the optimal estimation actions are often not the optimal actions for accomplishing the control tasks, and thus agents trade between exploration and exploitation. This work frames the problem as a Bayes-adaptive Markov decision process and solves it online using Monte Carlo tree search and an extended Kalman filter to handle Gaussian process noise and parameter uncertainty in a continuous space. MCTS selects control actions to reduce model uncertainty and reach the goal state nearly optimally. Certainty equivalent model predictive control is used as a benchmark to compare performance in simulations with varying process…
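
One common way to track parameter uncertainty with an extended Kalman filter is to append the unknown parameter to the state vector. The sketch below does this for a toy system and is only an illustration under stated assumptions: the velocity/mass model, the noise levels `q` and `r`, and the function name `ekf_step` are hypothetical, and the abstract does not say whether the paper uses this exact construction. It covers the belief update only, not the tree search.

```python
def ekf_step(mean, cov, u, z, dt=0.1, q=1e-4, r=1e-2):
    """One EKF predict/update on the augmented state [velocity, mass].

    Hypothetical toy model: v_next = v + (u / m) * dt + process noise,
    the mass m is constant, and the sensor measures velocity only.
    """
    v, m = mean
    (p00, p01), (p10, p11) = cov

    # Predict the mean with the nonlinear dynamics.
    v_pred = v + (u / m) * dt
    m_pred = m

    # Linearize: F = [[1, f01], [0, 1]] with f01 = d(v_next)/dm.
    f01 = -u * dt / (m * m)

    # P_pred = F P F^T + Q, where Q = diag(q, 0).
    pp00 = p00 + f01 * (p01 + p10) + f01 * f01 * p11 + q
    pp01 = p01 + f01 * p11
    pp10 = p10 + f01 * p11
    pp11 = p11

    # Update with measurement z = v + noise, i.e. H = [1, 0].
    s = pp00 + r                  # innovation variance
    k0, k1 = pp00 / s, pp10 / s   # Kalman gain
    innov = z - v_pred
    new_mean = (v_pred + k0 * innov, m_pred + k1 * innov)
    new_cov = (
        ((1.0 - k0) * pp00, (1.0 - k0) * pp01),
        (pp10 - k1 * pp00, pp11 - k1 * pp01),
    )
    return new_mean, new_cov


prior_mean = (0.0, 1.0)                  # velocity 0, mass guess 1
prior_cov = ((0.01, 0.0), (0.0, 1.0))    # large uncertainty in the mass
mean_push, cov_push = ekf_step(prior_mean, prior_cov, u=1.0, z=0.05)
mean_idle, cov_idle = ekf_step(prior_mean, prior_cov, u=0.0, z=0.0)
# cov_push[1][1] is smaller than the prior mass variance; cov_idle[1][1] is not.
```

In this toy model a nonzero force is needed to learn anything about the mass, while zero force leaves the mass variance unchanged. That is a small instance of the tension the abstract describes between actions that are good for estimation and actions that are good for control.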
