Dual Control for Approximate Bayesian Reinforcement Learning
Edgar D. Klenske, Philipp Hennig

TL;DR
This paper extends dual control methods from control theory to approximate Bayesian reinforcement learning using modern regression techniques, enabling structured exploration in complex dynamical systems.
Contribution
It reviews an established approximate dual control method from control theory and extends it to generalized linear regression, yielding a tractable approximation to Bayesian RL along with structured exploration strategies.
Findings
Framework offers useful approximation to intractable Bayesian RL
Produces structured exploration strategies different from standard RL
Applicable to Gaussian process regression and neural networks
Abstract
Control of non-episodic, finite-horizon dynamical systems with uncertain dynamics poses a tough and elementary case of the exploration-exploitation trade-off. Bayesian reinforcement learning, reasoning about the effect of actions and future observations, offers a principled solution, but is intractable. We review, then extend an old approximate approach from control theory (where the problem is known as dual control) in the context of modern regression methods, specifically generalized linear regression. Experiments on simulated systems show that this framework offers a useful approximation to the intractable aspects of Bayesian RL, producing structured exploration strategies that differ from standard RL approaches. We provide simple examples for the use of this framework in (approximate) Gaussian process regression and feedforward neural networks for the control of exploration.
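To illustrate why control and learning interact in this setting, here is a minimal sketch of a Bayesian linear-regression update for an uncertain scalar system. It is not the authors' implementation. The linear model `x_next = a*x + b*u + noise`, the feature map `features`, the helper `posterior_update`, and all numeric values are assumptions chosen for illustration. The sketch shows only that the posterior uncertainty about the dynamics depends on the chosen control input, which is the effect a dual controller reasons about.

```python
import numpy as np

def features(x, u):
    # Hypothetical feature map for the assumed model x_next = a*x + b*u + noise.
    return np.array([x, u])

def posterior_update(mean, cov, phi, y, noise_var):
    """One Gaussian posterior update for y = phi @ w + noise (illustrative helper)."""
    s = phi @ cov @ phi + noise_var            # predictive variance of y
    gain = cov @ phi / s                       # Kalman-style gain vector
    new_mean = mean + gain * (y - phi @ mean)  # shift the mean toward the observation
    new_cov = cov - np.outer(gain, phi @ cov)  # shrink covariance along phi
    return new_mean, new_cov

# Assumed prior over the weights w = (a, b): zero mean, unit variance.
mean0 = np.zeros(2)
cov0 = np.eye(2)
x = 0.5          # assumed current state
noise_var = 0.1  # assumed observation noise variance

# The posterior covariance does not depend on the observed value y,
# so a placeholder observation of 0.0 is enough to compare the two inputs.
_, cov_passive = posterior_update(mean0, cov0, features(x, 0.0), 0.0, noise_var)
_, cov_probing = posterior_update(mean0, cov0, features(x, 2.0), 0.0, noise_var)

print("variance of b after u = 0:", cov_passive[1, 1])
print("variance of b after u = 2:", cov_probing[1, 1])
```

With a zero input, the observation carries no information about the input gain `b`, so its variance stays at the prior value. A larger input reduces that variance, at the price of disturbing the state. Weighing that trade-off over a planning horizon is what makes the exact Bayesian solution intractable and motivates the approximation.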
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Advanced Bandit Algorithms Research · Advanced Control Systems Optimization
Methods: Gaussian Process
