Dual Control for Approximate Bayesian Reinforcement Learning

Edgar D. Klenske; Philipp Hennig

arXiv:1510.03591 · stat.ML · August 12, 2016 · 2 cites

Open Access

TL;DR

This paper extends dual control methods from control theory to approximate Bayesian reinforcement learning using modern regression techniques, enabling structured exploration in complex dynamical systems.

Contribution

The paper reviews dual control, an older approximate approach from control theory, and extends it to Bayesian RL with generalized linear regression, yielding practical approximations and exploration strategies.

Findings

01. The framework offers a useful approximation to the intractable aspects of Bayesian RL.

02. It produces structured exploration strategies that differ from standard RL approaches.

03. It applies to (approximate) Gaussian process regression and feedforward neural networks.

Abstract

Control of non-episodic, finite-horizon dynamical systems with uncertain dynamics poses a tough and elementary case of the exploration-exploitation trade-off. Bayesian reinforcement learning, reasoning about the effect of actions and future observations, offers a principled solution, but is intractable. We review, then extend an old approximate approach from control theory (where the problem is known as dual control) in the context of modern regression methods, specifically generalized linear regression. Experiments on simulated systems show that this framework offers a useful approximation to the intractable aspects of Bayesian RL, producing structured exploration strategies that differ from standard RL approaches. We provide simple examples for the use of this framework in (approximate) Gaussian process regression and feedforward neural networks for the control of exploration.
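The abstract's central idea, that actions influence what will be learned from future observations, can be illustrated with a minimal sketch. This is not the authors' algorithm. It assumes a hypothetical scalar system x_next = a * x + b * u + noise, with a and the noise variance known and a Gaussian belief over the unknown gain b, which is the simplest one-feature case of linear regression. The function names and default values are chosen here for illustration only.

```python
# Minimal sketch (assumption, not the paper's method): scalar linear system
#   x_next = a * x + b * u + noise,   noise ~ N(0, noise_var)
# with a known, b unknown, and a Gaussian belief N(mean, var) over b.

def update_belief(mean, var, x, u, x_next, a=1.0, noise_var=0.1):
    """Conjugate Gaussian update of the belief over b after one transition."""
    residual = x_next - a * x                   # equals b * u + noise
    precision = 1.0 / var + u * u / noise_var   # prior precision + data precision
    new_var = 1.0 / precision
    new_mean = new_var * (mean / var + u * residual / noise_var)
    return new_mean, new_var


def predicted_variance(var, u, noise_var=0.1):
    """Variance of the belief over b after applying input u.

    In this linear-Gaussian setting the posterior variance depends only on
    the input u, not on the outcome, so it can be computed before acting.
    """
    return 1.0 / (1.0 / var + u * u / noise_var)


if __name__ == "__main__":
    prior_var = 1.0
    for u in (0.0, 0.1, 2.0):
        print(u, predicted_variance(prior_var, u))
```

In this toy setting a zero input leaves the belief unchanged, while a larger input shrinks the predicted variance more. A controller that accounts for this effect when choosing inputs trades immediate cost against information gain, which is the exploration-exploitation trade-off the abstract describes.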

Taxonomy

Topics: Gaussian Processes and Bayesian Inference · Advanced Bandit Algorithms Research · Advanced Control Systems Optimization

Methods: Gaussian Process