
TL;DR
This paper extends robust control theory to explicitly handle gradient uncertainty in value functions, formulating a new nonlinear PDE and proposing a practical algorithm to improve stability in reinforcement learning.
Contribution
It introduces the GU-HJBI equation for gradient uncertainty, proves its well-posedness, analyzes the LQ case, and develops the GURAC algorithm for robust control with gradient perturbations.
Findings
In the linear-quadratic (LQ) case, the classical quadratic value function assumption fails for any non-zero gradient uncertainty.
The GURAC algorithm stabilizes training in reinforcement learning.
Theoretical analysis reveals fundamental changes in control law structure.
Abstract
We introduce a novel extension to robust control theory that explicitly addresses uncertainty in the value function's gradient, a form of uncertainty endemic to applications like reinforcement learning where value functions are approximated. We formulate a zero-sum dynamic game where an adversary perturbs both system dynamics and the value function gradient, leading to a new, highly nonlinear partial differential equation: the Hamilton-Jacobi-Bellman-Isaacs Equation with Gradient Uncertainty (GU-HJBI). We establish its well-posedness by proving a comparison principle for its viscosity solutions under a uniform ellipticity condition. Our analysis of the linear-quadratic (LQ) case yields a key insight: we prove that the classical quadratic value function assumption fails for any non-zero gradient uncertainty, fundamentally altering the problem structure. A formal perturbation analysis…
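For context on the LQ claim, the sketch below shows the classical baseline that the abstract says breaks down: with no gradient uncertainty, a scalar LQ problem admits an exactly quadratic value function. It is a minimal illustration and not the paper's GU-HJBI or GURAC. The scalar dynamics dx/dt = a*x + b*u, the running cost q*x^2 + r*u^2, and all function names and parameter values are assumptions chosen for illustration.

```python
import math

# Classical (no gradient uncertainty) scalar LQ problem, assumed for illustration:
#   dynamics      dx/dt = a*x + b*u
#   running cost  q*x**2 + r*u**2   (q > 0, r > 0)
# HJB equation:   0 = min_u [ q*x**2 + r*u**2 + V'(x) * (a*x + b*u) ]
# The quadratic ansatz V(x) = p*x**2 reduces the HJB equation to the scalar
# Riccati equation  q + 2*a*p - (b**2 / r) * p**2 = 0.

def lq_riccati_scalar(a, b, q, r):
    """Return the positive root p of q + 2*a*p - (b**2 / r) * p**2 = 0."""
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)

def hjb_residual(x, p, a, b, q, r):
    """Evaluate the HJB right-hand side at x for V(x) = p*x**2."""
    dV = 2.0 * p * x              # gradient of the quadratic ansatz
    u = -b * dV / (2.0 * r)       # minimizer of the Hamiltonian in u
    return q * x * x + r * u * u + dV * (a * x + b * u)

# Hypothetical parameter values.
a, b, q, r = 0.5, 1.0, 1.0, 1.0
p = lq_riccati_scalar(a, b, q, r)

# The residual vanishes (up to rounding) at every state, so the quadratic
# ansatz is exact in the classical setting.
residuals = [hjb_residual(x, p, a, b, q, r) for x in (-2.0, -0.5, 0.0, 1.0, 3.0)]
print(p, residuals)
```

Every term in this residual scales as x^2, which is why a single constant p cancels it at all states. The abstract's claim is that this cancellation no longer holds once the adversary also perturbs the value function gradient.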
