Convex Programs and Lyapunov Functions for Reinforcement Learning: A Unified Perspective on the Analysis of Value-Based Methods

Xingang Guo; Bin Hu

arXiv:2202.06922·math.OC·February 15, 2022

TL;DR

This paper introduces a unified control-theoretic framework for analyzing value-based reinforcement learning methods using convex programs and Lyapunov functions, revealing deep connections with control systems.

Contribution

The paper applies existing convex testing conditions from control theory to analyze value-based RL algorithms and derive convergence results for them, bridging RL and control theory.

Findings

01. Convex testing conditions can be used to analyze RL algorithms.

02. Lyapunov functions can be constructed via convex programs.

03. Connections between feedback control and RL algorithms are established.

Abstract

Value-based methods play a fundamental role in Markov decision processes (MDPs) and reinforcement learning (RL). In this paper, we present a unified control-theoretic framework for analyzing value-based methods such as value computation (VC), value iteration (VI), and temporal difference (TD) learning (with linear function approximation). Building on an intrinsic connection between value-based methods and dynamic systems, we can directly use existing convex testing conditions in control theory to derive various convergence results for the aforementioned value-based methods. These testing conditions are convex programs in the form of either linear programming (LP) or semidefinite programming (SDP), and can be solved to construct Lyapunov functions in a straightforward manner. Our analysis reveals some intriguing connections between feedback control systems and RL algorithms. It is our hope…
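To make the LP-style testing condition concrete, here is a minimal sketch for value computation. It is an illustration under stated assumptions and not necessarily the exact condition used in the paper. It assumes the VC update V_{k+1} = r + gamma * P * V_k, so the error obeys e_{k+1} = gamma * P * e_k, and it uses a standard fact about positive linear systems: if a vector p > 0 satisfies gamma * P * p < p elementwise (a linear feasibility condition), then the weighted max-norm max_i |e_i| / p_i is a Lyapunov function. The transition matrix `P`, the discount `gamma`, the initial error, and all function names are hypothetical.

```python
# Hypothetical 3-state example; P and gamma are illustrative, not from the paper.
gamma = 0.9
P = [
    [0.5, 0.5, 0.0],
    [0.1, 0.6, 0.3],
    [0.0, 0.2, 0.8],
]  # row-stochastic transition matrix


def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]


def lp_certificate_holds(P, gamma, p):
    """Check the linear feasibility condition p > 0 and gamma * P p < p."""
    Ap = [gamma * v for v in matvec(P, p)]
    return all(pi > 0 for pi in p) and all(a < pi for a, pi in zip(Ap, p))


def weighted_max_norm(e, p):
    """Candidate Lyapunov function V(e) = max_i |e_i| / p_i."""
    return max(abs(ei) / pi for ei, pi in zip(e, p))


# For a row-stochastic P and gamma < 1, the all-ones vector is feasible.
p = [1.0, 1.0, 1.0]
feasible = lp_certificate_holds(P, gamma, p)

# Simulate the assumed error recursion e_{k+1} = gamma * P * e_k.
e = [1.0, -2.0, 0.5]
norms = [weighted_max_norm(e, p)]
for _ in range(20):
    e = [gamma * v for v in matvec(P, e)]
    norms.append(weighted_max_norm(e, p))

# The Lyapunov function should shrink by at least a factor gamma per step.
contracts = all(n1 <= gamma * n0 + 1e-12 for n0, n1 in zip(norms, norms[1:]))
print(feasible, contracts)
```

In this sketch the certificate `p` is guessed and then checked. In general it would be found by handing the same linear inequalities to an LP solver, and an SDP-based analysis would instead search for a quadratic Lyapunov function.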

Taxonomy

Topics: Health Systems, Economic Evaluations, Quality of Life