On the Robustness of Derivative-free Methods for Linear Quadratic Regulator
Weijian Li, Panagiotis Kounatidis, Zhong-Ping Jiang, Andreas A. Malikopoulos

TL;DR
This paper analyzes the robustness of derivative-free policy optimization methods for linear quadratic regulator problems, providing bounds on perturbations and convergence guarantees under noisy conditions.
Contribution
It characterizes how small perturbations affect the convergence of derivative-free methods in LQR, offering explicit bounds and sample complexity analysis.
Findings
Under sufficiently small perturbations, derivative-free methods converge to any pre-specified neighborhood of the optimal policy.
Explicit bounds on the perturbation size that still permit this convergence are established.
Sample complexity for perturbed methods is derived.
Abstract
Policy optimization has drawn increasing attention in reinforcement learning, particularly in the context of derivative-free methods for linear quadratic regulator (LQR) problems with unknown dynamics. This paper focuses on characterizing the robustness of derivative-free methods for solving an infinite-horizon LQR problem. Specifically, we estimate policy gradients from cost values and study the effect of perturbations on these estimates, where the perturbations may arise from function approximation, measurement noise, and similar sources. We show that under sufficiently small perturbations, the derivative-free methods converge to any pre-specified neighborhood of the optimal policy. Furthermore, we establish explicit bounds on the perturbations, and provide the sample complexity for the perturbed derivative-free methods.
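The idea of estimating policy gradients from perturbed cost values can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: a discrete-time system with state feedback u = -Kx, a two-point zeroth-order estimator with directions sampled on the unit sphere, a bounded uniform perturbation added to each cost evaluation, and the particular matrices, smoothing radius, step size, and function names (`lqr_cost`, `perturbed_two_point_gradient`).

```python
import numpy as np

def lqr_cost(K, A, B, Q, R):
    """Infinite-horizon cost trace(P_K) of u = -K x for x_{t+1} = A x_t + B u_t."""
    A_cl = A - B @ K
    if np.max(np.abs(np.linalg.eigvals(A_cl))) >= 1.0:
        return np.inf  # K is not stabilizing, so the cost is unbounded
    n = A.shape[0]
    W = Q + K.T @ R @ K
    # Solve the Lyapunov equation P = W + A_cl^T P A_cl by vectorization.
    lhs = np.eye(n * n) - np.kron(A_cl.T, A_cl.T)
    P = np.linalg.solve(lhs, W.reshape(-1)).reshape(n, n)
    return float(np.trace(P))

def perturbed_two_point_gradient(K, cost, radius, num_samples, eps, rng):
    """Two-point zeroth-order gradient estimate built from perturbed cost values."""
    d = K.size
    grad = np.zeros_like(K)
    for _ in range(num_samples):
        U = rng.standard_normal(K.shape)
        U /= np.linalg.norm(U)  # uniform direction on the unit sphere
        # Bounded perturbation on each cost value (e.g. measurement noise).
        noise_plus, noise_minus = rng.uniform(-eps, eps, size=2)
        j_plus = cost(K + radius * U) + noise_plus
        j_minus = cost(K - radius * U) + noise_minus
        grad += (j_plus - j_minus) / (2.0 * radius) * U
    return d * grad / num_samples

# Hypothetical example system (open-loop stable, so K = 0 is stabilizing).
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.eye(1)

rng = np.random.default_rng(0)
cost = lambda K: lqr_cost(K, A, B, Q, R)

K = np.zeros((1, 2))
initial_cost = cost(K)
for _ in range(300):
    g = perturbed_two_point_gradient(K, cost, radius=0.05,
                                     num_samples=10, eps=1e-3, rng=rng)
    K = K - 0.001 * g  # plain gradient step with a small fixed step size
final_cost = cost(K)
print(initial_cost, final_cost)
```

In this sketch the perturbation enters the gradient estimate scaled by roughly d * eps / radius, which is one way to see why the perturbation size has to be small relative to the smoothing radius for the iterates to settle near the optimum. The sketch makes no attempt to reproduce the paper's bounds or sample complexity.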
Taxonomy
Topics: Advanced Control Systems Optimization · Control Systems and Identification
