Data-Based Efficient Off-Policy Stabilizing Optimal Control Algorithms for Discrete-Time Linear Systems via Damping Coefficients

Dongdong Li; Jiuxiang Dong

arXiv:2412.20845 · eess.SY · March 20, 2025

Open Access

TL;DR

This paper introduces two model-free, off-policy reinforcement learning algorithms for unknown discrete-time linear systems. Both use variable damping coefficients to reach a stabilizing optimal control without requiring an initial stabilizing control, and both are validated in simulation.

Contribution

The paper proposes off-policy RL algorithms based on variable damping coefficients. They obtain an initial stabilizing control in a finite number of iteration steps and learn the control from data for unknown linear systems.

Findings

01. The algorithms converge rapidly, at a rate similar to traditional policy iteration.

02. No initial stabilizing control is needed, which simplifies implementation.

03. Simulation results validate the effectiveness of the algorithms.

Abstract

Policy iteration is a classical reinforcement learning framework, but it requires a known initial stabilizing control, and finding that control depends on a known system model. To relax this requirement and achieve model-free optimal control, this paper designs two reinforcement learning algorithms for unknown discrete-time linear systems, both based on policy iteration and variable damping coefficients. First, a stable artificial system is designed, and this system is gradually iterated toward the original system by varying the damping coefficients. This allows the initial stabilizing control to be obtained in a finite number of iteration steps. Then, an off-policy iteration algorithm and an off-policy $Q$-learning algorithm are designed to select appropriate damping coefficients and make the design data-driven. In these two…
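
The damping idea described in the abstract can be illustrated with a minimal, model-based sketch. The abstract does not give the formulas, so everything below is an assumption made for illustration: the artificial system is taken to be x_{k+1} = a (A x_k + B u_k) with damping coefficient a in (0, 1], the zero gain stabilizes it when a is small enough, and a is raised toward 1 between policy-iteration steps. The function names, the `margin` parameter, and the rule for raising a are hypothetical. Unlike the paper's algorithms, this sketch uses known A and B instead of measured data.

```python
import numpy as np

def spectral_radius(M):
    """Largest eigenvalue magnitude of a square matrix."""
    return float(np.max(np.abs(np.linalg.eigvals(M))))

def policy_evaluation(Acl, Qbar):
    """Solve the Lyapunov equation P = Acl^T P Acl + Qbar (Acl must be Schur stable)."""
    n = Acl.shape[0]
    lhs = np.eye(n * n) - np.kron(Acl.T, Acl.T)
    P = np.linalg.solve(lhs, Qbar.reshape(-1)).reshape(n, n)
    return 0.5 * (P + P.T)  # symmetrize against round-off

def damped_policy_iteration(A, B, Q, R, margin=0.9, tol=1e-10, max_iter=200):
    """Policy iteration started from K = 0 on an assumed damped system (a*A, a*B)."""
    n, m = B.shape
    K = np.zeros((m, n))
    rho0 = spectral_radius(A)
    # Assumption: pick a so that a*A is Schur stable, hence K = 0 is stabilizing.
    a = 1.0 if rho0 < 1.0 else margin / rho0
    for _ in range(max_iter):
        Ad, Bd = a * A, a * B
        # Policy evaluation for u = -K x on the damped system.
        P = policy_evaluation(Ad - Bd @ K, Q + K.T @ R @ K)
        # Policy improvement on the damped system.
        K_new = np.linalg.solve(R + Bd.T @ P @ Bd, Bd.T @ P @ Ad)
        if a == 1.0 and np.linalg.norm(K_new - K) < tol:
            return K_new, a
        K = K_new
        # Hypothetical model-based rule: raise a while keeping a*(A - B K) stable.
        rho = spectral_radius(A - B @ K)
        a = 1.0 if rho < 1.0 else 0.5 * (a + 1.0 / rho)
    return K, a

# Illustrative open-loop unstable system (values chosen for this sketch only).
A = np.array([[1.1, 0.5], [0.0, 1.2]])
B = np.array([[0.0], [1.0]])
K, a = damped_policy_iteration(A, B, np.eye(2), np.eye(1))
print("final damping coefficient:", a)
print("gain K:", K)
print("closed-loop spectral radius:", spectral_radius(A - B @ K))
```

Once a reaches 1, the loop is ordinary policy iteration on the original system, started from a gain that the damped iterations produced rather than one supplied in advance. The data-driven selection of damping coefficients, which is the paper's actual contribution, is not reproduced here.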

Taxonomy

Topics: Advanced Control Systems Optimization · Frequency Control in Power Systems · Stability and Control of Uncertain Systems