Optimal Control with Learning on the Fly: System with Unknown Drift
Daniel Gurevich, Debdipta Goswami, Charles L. Fefferman, Clarence W. Rowley

TL;DR
This paper develops an optimal control method for stochastic systems with unknown drift, balancing learning and control to minimize regret over a finite horizon, and demonstrates the potential of Bayesian strategies for such problems.
Contribution
It introduces a finite-horizon control framework for systems with unknown parameters, deriving strategies that minimize worst-case regret using Bayesian approaches.
Findings
Derived control strategies that minimize worst-case regret.
Showed Bayesian strategies can be effective for unknown parameter control.
Compared performance with full-knowledge controllers to quantify regret.
Abstract
This paper derives an optimal control strategy for a simple stochastic dynamical system with constant drift and an additive control input. Motivated by the example of a physical system with an unexpected change in its dynamics, we take the drift parameter to be unknown, so that it must be learned while controlling the system. The state of the system is observed through a linear observation model with Gaussian noise. In contrast to most previous work, which focuses on a controller's asymptotic performance over an infinite time horizon, we minimize a quadratic cost function over a finite time horizon. The performance of our control strategy is quantified by comparing its cost with the cost incurred by an optimal controller that has full knowledge of the parameters. This approach gives rise to several notions of "regret." We derive a set of control strategies that provably minimize the…
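The setup described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's strategy: it assumes a discretized scalar system x_{k+1} = x_k + (a + u_k) dt + sqrt(dt) w_k, a running cost (x^2 + u^2) dt, a directly observed state (a simplification of the noisy linear observation model), and a heuristic feedback rule u = -a_hat - x. The names `simulate`, `regret`, `know_drift`, and `noise_scale` are hypothetical. It only shows how regret can be measured as the cost gap between a controller that must learn the drift and one that knows it.

```python
import numpy as np

def simulate(a, horizon=1.0, n_steps=200, know_drift=False, seed=0, noise_scale=1.0):
    """Return the accumulated quadratic cost for one simulated trajectory.

    Assumed dynamics: x_{k+1} = x_k + (a + u_k) * dt + noise_scale * sqrt(dt) * w_k.
    Assumed cost: sum of (x_k**2 + u_k**2) * dt over the finite horizon.
    """
    rng = np.random.default_rng(seed)
    dt = horizon / n_steps
    x, cost = 0.0, 0.0
    drift_sum = 0.0  # sum of (state increment - control contribution)
    for k in range(n_steps):
        if know_drift:
            a_hat = a  # full-knowledge baseline
        elif k > 0:
            a_hat = drift_sum / (k * dt)  # sample-mean drift estimate
        else:
            a_hat = 0.0  # no data yet
        # Heuristic rule (an assumption, not the optimal controller):
        # cancel the estimated drift and pull the state toward zero.
        u = -a_hat - x
        dx = (a + u) * dt + noise_scale * np.sqrt(dt) * rng.standard_normal()
        drift_sum += dx - u * dt
        cost += (x**2 + u**2) * dt
        x += dx
    return cost

def regret(a, seed=0, **kwargs):
    """Cost of the learning controller minus cost of the full-knowledge one,
    evaluated on the same noise realization (same seed)."""
    learner = simulate(a, know_drift=False, seed=seed, **kwargs)
    oracle = simulate(a, know_drift=True, seed=seed, **kwargs)
    return learner - oracle

if __name__ == "__main__":
    samples = [regret(2.0, seed=s) for s in range(100)]
    print("mean cost gap over 100 runs:", np.mean(samples))
```

Because the feedback rule here is only a heuristic, the cost gap for a single run can be negative; the paper's notions of regret are defined relative to an optimal full-knowledge controller, which this sketch does not implement.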
Taxonomy
Topics: Advanced Bandit Algorithms Research · Water resources management and optimization · Gaussian Processes and Bayesian Inference
