Linear Dynamics meets Linear MDPs: Closed-Form Optimal Policies via Reinforcement Learning

Abed AlRahman Al Makdah; Oliver Kosut; Lalitha Sankar; Shaofeng Zou

arXiv:2508.17185·math.OC·August 26, 2025

TL;DR

This paper proposes a reinforcement learning method for linear dynamical systems that interact with a stochastic environment. By combining classical linear-quadratic control with linear MDPs, it yields an explicit optimal policy without estimating transition probabilities.

Contribution

It derives a closed-form optimal policy for a deterministic linear time-invariant system coupled with a feature-based linear Markov process, integrating the LQR and linear MDP frameworks for simplicity and robustness.

Findings

01. Explicit parametric form of the optimal policy derived.

02. Theoretical guarantees on system stability under the learned policy.

03. Sample complexity analysis demonstrating convergence to optimal control.

Abstract

Many applications -- including power systems, robotics, and economics -- involve a dynamical system interacting with a stochastic and hard-to-model environment. We adopt a reinforcement learning approach to control such systems. Specifically, we consider a deterministic, discrete-time, linear, time-invariant dynamical system coupled with a feature-based linear Markov process with an unknown transition kernel. The objective is to learn a control policy that optimizes a quadratic cost over the system state, the Markov process, and the control input. Leveraging both components of the system, we derive an explicit parametric form for the optimal state-action value function and the corresponding optimal policy. Our model is distinct in combining aspects of both classical Linear Quadratic Regulator (LQR) and linear Markov decision process (MDP) frameworks. This combination retains the…
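For orientation, the classical LQR ingredient mentioned in the abstract has a well-known closed-form solution when the model is fully known and there is no Markov process. The sketch below is not the paper's method: it only illustrates that baseline by iterating the discrete-time Riccati recursion to obtain a linear feedback policy u = -K x. The function name `lqr_gain`, the double-integrator matrices, and the cost weights are all hypothetical choices made for illustration.

```python
import numpy as np


def lqr_gain(A, B, Q, R, iters=500, tol=1e-10):
    """Iterate the discrete-time Riccati recursion until P stops changing,
    then return the feedback gain K (policy u = -K x) and the cost matrix P.

    Assumes the cost sum_t x_t' Q x_t + u_t' R u_t with known A and B.
    """
    P = Q.copy()
    for _ in range(iters):
        BtP = B.T @ P
        # K = (R + B' P B)^{-1} B' P A
        K = np.linalg.solve(R + BtP @ B, BtP @ A)
        # Riccati update: P = Q + A' P (A - B K)
        P_next = Q + A.T @ P @ (A - B @ K)
        converged = np.max(np.abs(P_next - P)) < tol
        P = P_next
        if converged:
            break
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return K, P


# Hypothetical double-integrator system (illustrative, not from the paper).
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)           # state cost weight (assumed)
R = np.array([[1.0]])   # input cost weight (assumed)

K, P = lqr_gain(A, B, Q, R)
closed_loop = A - B @ K
spectral_radius = max(abs(np.linalg.eigvals(closed_loop)))
print("K =", K)
print("spectral radius of A - BK =", spectral_radius)
```

A spectral radius below 1 for `A - B @ K` indicates that the closed loop is stable. The setting described in the abstract differs in that the transition kernel of the Markov process is unknown, so the policy has to be learned from data rather than computed from a known model as above.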
