Reinforcement Learning for a Discrete-Time Linear-Quadratic Control Problem with an Application

Lucky Li

arXiv:2412.05906 · stat.ML · February 5, 2025

Open Access

TL;DR

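This paper applies reinforcement learning to a discrete-time linear-quadratic control problem, proves that the optimal feedback policy is Gaussian when the cost of exploration is measured by entropy, and applies the result to a discrete-time mean-variance asset-liability management problem, for which it proves policy improvement and convergence of the RL algorithm.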

Contribution

It introduces an entropy-regularized reinforcement learning framework for discrete-time LQ control, proves that the optimal feedback policy is Gaussian, and applies the method to mean-variance asset-liability management with policy improvement and convergence guarantees.

Findings

01 Optimal policies are Gaussian in the RL framework.

02 The RL algorithm converges and improves policies in the financial application.

03 Numerical simulations validate the theoretical results.

Abstract

We study the discrete-time linear-quadratic (LQ) control model using reinforcement learning (RL). Using entropy to measure the cost of exploration, we prove that the optimal feedback policy for the problem must be of Gaussian type. Then, we apply the results of the discrete-time LQ model to solve the discrete-time mean-variance asset-liability management problem and prove policy improvement and convergence for our RL algorithm. Finally, a simulated numerical example illustrates the theoretical results.
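To make the Gaussian claim concrete, here is a minimal numerical sketch of the standard one-step argument, not the paper's algorithm. It assumes a scalar control u, a hypothetical quadratic cost q(u) = a*u^2 + b*u with a > 0, and an entropy-regularized objective of the form E[q(u)] + lam * E[ln pi(u)], where lam is an assumed exploration weight. Under those assumptions the minimizing density is proportional to exp(-q(u)/lam), which is Gaussian with mean -b/(2a) and variance lam/(2a). All names and parameter values below are illustrative.

```python
import numpy as np

def gibbs_policy(a, b, lam, grid):
    """Density proportional to exp(-(a*u^2 + b*u)/lam) on a uniform grid."""
    q = a * grid**2 + b * grid
    w = np.exp(-(q - q.min()) / lam)  # shift by the minimum for numerical stability
    du = grid[1] - grid[0]
    return w / (w.sum() * du)         # normalize so the density integrates to 1

def gaussian(grid, mean, var):
    """Gaussian density evaluated on the grid."""
    return np.exp(-(grid - mean) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

def moments(density, grid):
    """Mean and variance of a gridded density (Riemann sums)."""
    du = grid[1] - grid[0]
    mean = np.sum(grid * density) * du
    var = np.sum((grid - mean) ** 2 * density) * du
    return mean, var

def objective(density, a, b, lam, grid):
    """Expected quadratic cost plus lam times negative entropy."""
    du = grid[1] - grid[0]
    q = a * grid**2 + b * grid
    mask = density > 0                # skip zero-density points to avoid log(0)
    neg_entropy = np.sum(density[mask] * np.log(density[mask])) * du
    return np.sum(density * q) * du + lam * neg_entropy

# Hypothetical parameter values chosen only for illustration.
a, b, lam = 2.0, -4.0, 0.5
grid = np.linspace(-3.0, 5.0, 80001)

pi = gibbs_policy(a, b, lam, grid)
mean, var = moments(pi, grid)
closed_mean, closed_var = -b / (2 * a), lam / (2 * a)

# Compare against Gaussians with the same mean but a different variance.
j_opt = objective(pi, a, b, lam, grid)
j_wide = objective(gaussian(grid, closed_mean, 2 * closed_var), a, b, lam, grid)
j_narrow = objective(gaussian(grid, closed_mean, closed_var / 2), a, b, lam, grid)

print(mean, closed_mean)
print(var, closed_var)
print(j_opt, j_wide, j_narrow)
```

In this sketch the numerically computed mean and variance should match the closed-form values, and the objective at the Gibbs density should be lower than at the wider or narrower Gaussian. The variance scales with lam, so a larger exploration weight gives a more diffuse policy while the mean stays at the minimizer of the quadratic cost.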

Taxonomy

Topics: Adaptive Dynamic Programming Control