A Modified Adaptive Data-Enabled Policy Optimization Control to Resolve State Perturbations
Mojtaba Kaheni, Niklas Persson, Vittorio De Iuliis, Costanzo Manes, Alessandro V. Papadopoulos

TL;DR
This paper introduces PFDeePO, a modification of the data-enabled policy optimization (DeePO) algorithm that dispenses with probing noise, removing the state perturbations that noise causes while preserving stability and improving convergence.
Contribution
The paper presents a perturbation-free modification of DeePO that pauses updates near equilibrium and applies scaled multiplicative noise, so that the controller converges stably without probing noise added to the control signal.
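The two ingredients named above, pausing updates near equilibrium and scaled multiplicative noise, can be illustrated with a minimal sketch. This is not the paper's formulation: the functions `control_input` and `should_update`, the noise model `(1 + sigma * eta) * K x`, and the norm threshold are all hypothetical choices made here only to show why noise that multiplies the nominal input vanishes at the equilibrium, whereas additive noise would not.

```python
import numpy as np

def control_input(K, x, sigma, rng):
    """Hypothetical multiplicative excitation: noise scales the nominal input K x."""
    nominal = K @ x
    # The perturbation is proportional to the nominal input, so it is zero when x = 0.
    return nominal * (1.0 + sigma * rng.standard_normal(nominal.shape))

def should_update(x, threshold):
    """Hypothetical gate: skip policy updates once the state is near the origin."""
    return np.linalg.norm(x) >= threshold

rng = np.random.default_rng(0)
K = np.array([[-1.0, -1.5]])  # illustrative gain, not taken from the paper

u_at_origin = control_input(K, np.zeros(2), sigma=0.5, rng=rng)
u_away = control_input(K, np.array([1.0, 0.0]), sigma=0.5, rng=rng)

print("input at equilibrium:", u_at_origin)
print("input away from equilibrium:", u_away)
print("update at equilibrium?", should_update(np.zeros(2), threshold=1e-3))
```

Under these assumptions the input at the origin is exactly zero regardless of the noise draw, so the excitation cannot push the state away from equilibrium, and the gate keeps the gain fixed once the data carry no new information.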
Findings
PFDeePO effectively eliminates state perturbations in simulations.
The modified algorithm maintains system stability and performance.
It reduces reliance on probing noise for persistent excitation.
Abstract
This paper proposes modifications to the data-enabled policy optimization (DeePO) algorithm to mitigate state perturbations. DeePO is an adaptive, data-driven approach designed to iteratively compute a feedback gain equivalent to the certainty-equivalence LQR gain. Like other data-driven approaches based on Willems' fundamental lemma, DeePO requires persistently exciting input signals. However, linear state-feedback gains from LQR designs cannot inherently produce such inputs. To address this, probing noise is conventionally added to the control signal to ensure persistent excitation. However, the added noise may induce undesirable state perturbations. We first identify two key issues that jeopardize the desired performance of DeePO when probing noise is not added: the convergence of states to the equilibrium point, and the convergence of the controller to its optimal value. To address…
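The abstract's point that linear state feedback cannot by itself produce persistently exciting inputs can be checked numerically. The sketch below is an illustration, not the paper's experiment: the system matrices `A`, `B`, the gain `K`, the noise level, and the helper names `collect_data` and `excitation_rank` are assumptions chosen for the example. It uses the rank of the stacked input-state data matrix as the excitation test, which for a system with n states and m inputs needs to reach n + m.

```python
import numpy as np

def collect_data(A, B, K, x0, T, noise_std, rng):
    """Simulate x+ = A x + B u with u = K x + probing noise; return input and state data."""
    n, m = B.shape
    X = np.zeros((n, T))
    U = np.zeros((m, T))
    x = x0.copy()
    for t in range(T):
        u = K @ x + noise_std * rng.standard_normal(m)
        X[:, t] = x
        U[:, t] = u
        x = A @ x + B @ u
    return U, X

def excitation_rank(U, X):
    """Rank of the stacked data matrix [U; X]; full rank is n + m."""
    return np.linalg.matrix_rank(np.vstack([U, X]), tol=1e-9)

rng = np.random.default_rng(0)
A = np.array([[1.0, 0.1], [0.0, 1.0]])  # illustrative double integrator
B = np.array([[0.0], [0.1]])
K = np.array([[-1.0, -1.5]])            # illustrative stabilizing gain
x0 = np.array([1.0, 0.0])

U_plain, X_plain = collect_data(A, B, K, x0, T=200, noise_std=0.0, rng=rng)
U_noisy, X_noisy = collect_data(A, B, K, x0, T=200, noise_std=0.1, rng=rng)

rank_plain = excitation_rank(U_plain, X_plain)
rank_noisy = excitation_rank(U_noisy, X_noisy)

print("rank without probing noise:", rank_plain)
print("rank with probing noise:   ", rank_noisy)
print("final state norm without noise:", np.linalg.norm(X_plain[:, -1]))
print("final state norm with noise:   ", np.linalg.norm(X_noisy[:, -1]))
```

With pure feedback, every input column is `K` times the matching state column, so the stacked matrix can never exceed rank n. Probing noise restores full rank but also keeps the state from settling at the origin, which is the trade-off the abstract describes.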
