Analyzing the Impact of Computation in Adaptive Dynamic Programming for Stochastic LQR Problem

Wenhan Cao; Alexandre Capone; Sandra Hirche; Wei Pan

arXiv:2402.09575·math.OC·February 16, 2024·1 cites

Open Access

TL;DR

This paper investigates how the sampling period, and the computational errors it introduces, affect the control performance of adaptive dynamic programming in stochastic LQR problems. It reports an O(h) convergence rate in the sampling period h and validates the findings on a sensorimotor control task.

Contribution

It establishes the impact of sampling period on ADP convergence and control quality, linking computational errors to convergence behavior in stochastic LQR.

Findings

01. Convergence rate is O(h) with respect to the sampling period h.

02. Sampling period significantly influences control performance.

03. Theoretical results validated by a sensorimotor control experiment.
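
To give a feel for what an O(h) dependence on the sampling period looks like, the following minimal sketch approximates a running-cost integral from samples taken every h seconds using a left-endpoint sum, which is the rule the Euler-Maruyama method applies to the time integral. It is an illustration under simplifying assumptions, not the paper's experiment: the trajectory x(t) = exp(-t) is deterministic and hypothetical, and the names `left_endpoint_integral` and `running_cost` are introduced here only for illustration.

```python
import numpy as np

def left_endpoint_integral(f, T, h):
    """Approximate the integral of f over [0, T] from samples taken every h."""
    n = int(round(T / h))          # number of sampling intervals
    t = h * np.arange(n)           # left endpoints 0, h, ..., T - h
    return h * np.sum(f(t))

def running_cost(t):
    """Hypothetical quadratic cost x(t)^2 along the assumed path x(t) = exp(-t)."""
    return np.exp(-2.0 * t)

T = 1.0
exact = 0.5 * (1.0 - np.exp(-2.0 * T))   # closed-form value of the integral

# Approximation error for three sampling periods, each half the previous one.
errors = {
    h: abs(left_endpoint_integral(running_cost, T, h) - exact)
    for h in (0.1, 0.05, 0.025)
}
for h, err in errors.items():
    print(f"h = {h:<6} error = {err:.5f}")
```

In this toy setting, halving h roughly halves the error, which is the first-order behavior an O(h) bound describes.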

Abstract

Adaptive dynamic programming (ADP) for stochastic linear quadratic regulation (LQR) demands the precise computation of stochastic integrals during policy iteration (PI). In a fully model-free problem setting, this computation can only be approximated by state samples collected at discrete time points using computational methods such as the canonical Euler-Maruyama method. Our research reveals a critical phenomenon: the sampling period can significantly impact control performance. This impact arises because computational errors introduced in each step of PI can significantly affect the algorithm's convergence behavior, which in turn influences the resulting control policy. We draw a parallel between PI and Newton's method applied to the Riccati equation to elucidate how the computation impacts control. In this light, the computational error in each PI step manifests itself as an…
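
The parallel between PI and Newton's method can be sketched on a simplified case. The code below is not the paper's model-free stochastic algorithm. It assumes a deterministic continuous-time LQR problem with a known model, for which policy iteration is the classical Kleinman iteration, a Newton-type method on the algebraic Riccati equation. The system matrices are made up, and the parameter `eps` is a hypothetical stand-in for the computational error injected into each policy-evaluation step.

```python
import numpy as np

def solve_lyapunov(Ac, M):
    """Solve Ac.T @ P + P @ Ac + M = 0 via Kronecker vectorization."""
    n = Ac.shape[0]
    I = np.eye(n)
    # Column-major vec: vec(Ac.T P) = kron(I, Ac.T) vec(P), vec(P Ac) = kron(Ac.T, I) vec(P)
    L = np.kron(I, Ac.T) + np.kron(Ac.T, I)
    p = np.linalg.solve(L, -M.reshape(-1, order="F"))
    P = p.reshape(n, n, order="F")
    return 0.5 * (P + P.T)  # symmetrize against round-off

def riccati_residual(A, B, Q, R, P):
    """Residual of A.T P + P A - P B R^{-1} B.T P + Q = 0."""
    return A.T @ P + P @ A - P @ B @ np.linalg.solve(R, B.T @ P) + Q

def policy_iteration(A, B, Q, R, K0, steps=15, eps=0.0, seed=0):
    """Kleinman iteration; eps > 0 perturbs each evaluated P (assumed error model)."""
    rng = np.random.default_rng(seed)
    K = K0
    for _ in range(steps):
        # Policy evaluation: cost matrix of the current gain K.
        P = solve_lyapunov(A - B @ K, Q + K.T @ R @ K)
        if eps > 0.0:
            E = rng.standard_normal(P.shape)
            E = 0.5 * (E + E.T)
            P = P + eps * E / np.linalg.norm(E)  # perturbation of Frobenius norm eps
        # Policy improvement.
        K = np.linalg.solve(R, B.T @ P)
    return P, K

# Hypothetical open-loop stable system, so K0 = 0 is a stabilizing initial gain.
A = np.array([[0.0, 1.0], [-1.0, -0.5]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
K0 = np.zeros((1, 2))

P_exact, _ = policy_iteration(A, B, Q, R, K0)
P_small, _ = policy_iteration(A, B, Q, R, K0, eps=1e-3)
P_large, _ = policy_iteration(A, B, Q, R, K0, eps=1e-2)

err_small = np.linalg.norm(P_small - P_exact)
err_large = np.linalg.norm(P_large - P_exact)
print("Riccati residual (no error):", np.linalg.norm(riccati_residual(A, B, Q, R, P_exact)))
print("error in P, eps=1e-3:", err_small)
print("error in P, eps=1e-2:", err_large)
```

Without perturbation the iteration drives the Riccati residual to round-off level. With perturbation it settles at a distance from the exact solution that grows with `eps`, which is the qualitative effect the abstract attributes to computational error in each PI step.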

Taxonomy

Topics: Adaptive Dynamic Programming Control · Electric Power System Optimization · Energy Load and Power Forecasting