Optimal Control with Learning on the Fly: System with Unknown Drift

Daniel Gurevich; Debdipta Goswami; Charles L. Fefferman; Clarence W. Rowley

arXiv:2202.03620·math.OC·February 9, 2022

Open Access

TL;DR

This paper develops an optimal control method for stochastic systems with unknown drift, balancing learning and control to minimize regret over a finite horizon, and demonstrates the potential of Bayesian strategies for such problems.

Contribution

It introduces a finite-horizon control framework for systems with unknown parameters, deriving strategies that minimize worst-case regret using Bayesian approaches.

Findings

01 Derived control strategies that minimize worst-case regret.

02 Showed Bayesian strategies can be effective for unknown parameter control.

03 Compared performance with full-knowledge controllers to quantify regret.

Abstract

This paper derives an optimal control strategy for a simple stochastic dynamical system with constant drift and an additive control input. Motivated by the example of a physical system with an unexpected change in its dynamics, we take the drift parameter to be unknown, so that it must be learned while controlling the system. The state of the system is observed through a linear observation model with Gaussian noise. In contrast to most previous work, which focuses on a controller's asymptotic performance over an infinite time horizon, we minimize a quadratic cost function over a finite time horizon. The performance of our control strategy is quantified by comparing its cost with the cost incurred by an optimal controller that has full knowledge of the parameters. This approach gives rise to several notions of "regret." We derive a set of control strategies that provably minimize the…
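The setup described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's control strategy: it assumes a discrete-time version of the dynamics, a state that is observed directly (a simplification of the noisy linear observation model), a Gaussian prior on the drift, and a heuristic certainty-equivalent controller with a fixed feedback gain. The "oracle" uses the same fixed-gain law with the true drift, so it is only a stand-in for the optimal full-knowledge controller. All names (`posterior_update`, `simulate`, `regret`, `gain`, `prior_prec`) are hypothetical.

```python
import random


def posterior_update(mean, prec, increment, dt, sigma):
    """Conjugate Gaussian update for the unknown drift a.

    Assumes increment = a*dt + sigma*sqrt(dt)*eps with eps ~ N(0, 1),
    so the likelihood adds precision dt / sigma**2 to the belief on a.
    """
    new_prec = prec + dt / sigma**2
    new_mean = (prec * mean + increment / sigma**2) / new_prec
    return new_mean, new_prec


def simulate(a_true, knows_drift, noise, dt=0.01, sigma=1.0,
             gain=1.0, prior_mean=0.0, prior_prec=1.0):
    """Run one trajectory and return (quadratic cost, final drift estimate)."""
    q, cost = 0.0, 0.0
    mean, prec = prior_mean, prior_prec
    for eps in noise:
        # Certainty equivalence (assumed heuristic): cancel the estimated
        # drift and add proportional feedback on the state.
        a_hat = a_true if knows_drift else mean
        u = -a_hat - gain * q
        cost += (q * q + u * u) * dt  # assumed quadratic running cost
        q_next = q + (a_true + u) * dt + sigma * dt**0.5 * eps
        # Remove the known control contribution, then learn from the rest.
        increment = q_next - q - u * dt
        mean, prec = posterior_update(mean, prec, increment, dt, sigma)
        q = q_next
    return cost, mean


def regret(a_true, steps=500, seed=0, **kwargs):
    """Cost of the learning controller minus the cost of the oracle,
    both driven by the same noise realization."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in range(steps)]
    learner_cost, a_hat = simulate(a_true, False, noise, **kwargs)
    oracle_cost, _ = simulate(a_true, True, noise, **kwargs)
    return learner_cost - oracle_cost, a_hat


if __name__ == "__main__":
    for a in (0.0, 1.0, 3.0):
        r, a_hat = regret(a)
        print(f"a={a:.1f}  estimate={a_hat:.3f}  regret={r:.4f}")
```

Because both controllers share one noise sequence, the difference in cost isolates the price of not knowing the drift for this particular heuristic. Since the oracle here is not the optimal full-knowledge controller, the difference is only a rough proxy for the regret notions studied in the paper.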

Taxonomy

Topics: Advanced Bandit Algorithms Research · Water resources management and optimization · Gaussian Processes and Bayesian Inference