On the Guarantees of Minimizing Regret in Receding Horizon
Andrea Martin, Luca Furieri, Florian Dörfler, John Lygeros, Giancarlo Ferrari-Trecate

TL;DR
This paper introduces a receding horizon control scheme that minimizes regret to bridge optimal control and online learning, providing stability, safety, and efficiency guarantees.
Contribution
It presents the first receding horizon approach based on finite horizon regret-optimal policies with proven stability, safety, and computational efficiency.
Findings
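The scheme outperforms standard methods when disturbances are unpredictable.
The scheme is recursively feasible and achieves bounded regret relative to the infinite horizon clairvoyant policy.
The policy optimization problem can be solved efficiently via convex-concave programming.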
Abstract
Towards bridging classical optimal control and online learning, regret minimization has recently been proposed as a control design criterion. This competitive paradigm penalizes the loss relative to the optimal control actions chosen by a clairvoyant policy, and allows tracking the optimal performance in hindsight no matter how disturbances are generated. In this paper, we propose the first receding horizon scheme based on the repeated computation of finite horizon regret-optimal policies, and we establish stability and safety guarantees for the resulting closed-loop system. Our derivations combine novel monotonicity properties of clairvoyant policies with suitable terminal ingredients. We prove that our scheme is recursively feasible, stabilizing, and that it achieves bounded regret relative to the infinite horizon clairvoyant policy. Last, we show that the policy optimization problem…
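To make the notion of regret in the abstract concrete, the following is a minimal sketch on a toy problem, not the paper's receding horizon scheme. It assumes a scalar linear system x_{t+1} = a x_t + b u_t + w_t with quadratic stage cost q x_{t+1}^2 + r u_t^2. The function names, the parameter values, and the fixed feedback gain are all hypothetical choices made for illustration. The clairvoyant policy sees the whole disturbance sequence in advance, so its inputs come from a single least-squares solve. Regret is the cost of a causal policy minus that clairvoyant cost.

```python
import numpy as np

def rollout_cost(a, b, q, r, x0, u, w):
    """Total cost of applying inputs u under disturbances w."""
    x, cost = x0, 0.0
    for u_t, w_t in zip(u, w):
        x = a * x + b * u_t + w_t
        cost += q * x**2 + r * u_t**2
    return cost

def clairvoyant_inputs(a, b, q, r, x0, w):
    """Optimal inputs in hindsight, i.e. with full knowledge of w."""
    T = len(w)
    # Lower-triangular map from (b*u_k + w_k) to the stacked states x_1..x_T.
    L = np.array([[a ** (t - k) if k <= t else 0.0 for k in range(T)]
                  for t in range(T)])
    # State trajectory that would result from zero inputs.
    free = np.array([a ** (t + 1) * x0 for t in range(T)]) + L @ w
    F = b * L
    # Minimize q*||F u + free||^2 + r*||u||^2 via the normal equations.
    H = q * F.T @ F + r * np.eye(T)
    return -np.linalg.solve(H, q * F.T @ free)

def feedback_inputs(a, b, k, x0, w):
    """Inputs of a causal static feedback u_t = -k * x_t."""
    x, u = x0, []
    for w_t in w:
        u_t = -k * x
        u.append(u_t)
        x = a * x + b * u_t + w_t
    return np.array(u)

# Hypothetical toy parameters: an unstable scalar plant, bounded random noise.
rng = np.random.default_rng(0)
a, b, q, r, x0 = 1.2, 1.0, 1.0, 0.1, 0.0
w = rng.uniform(-1.0, 1.0, size=20)

u_star = clairvoyant_inputs(a, b, q, r, x0, w)
u_fb = feedback_inputs(a, b, 1.0, x0, w)  # gain 1.0 is an arbitrary choice

regret = (rollout_cost(a, b, q, r, x0, u_fb, w)
          - rollout_cost(a, b, q, r, x0, u_star, w))
print(f"regret of the feedback policy: {regret:.4f}")
```

Because the clairvoyant inputs minimize the cost over all input sequences for the realized disturbance, the regret of any causal policy is nonnegative. A regret-optimal design would choose the causal policy to minimize the worst case of this quantity over admissible disturbances, which this sketch does not attempt.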
Taxonomy
Topics: Advanced Bandit Algorithms Research
