A General Control-Theoretic Approach for Reinforcement Learning: Theory and Algorithms
Weiqin Chen, Mark S. Squillante, Chai Wah Wu, Santiago Paternain

TL;DR
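This paper introduces a control-theoretic reinforcement learning framework that learns the optimal policy directly, proves convergence and optimality for its analogs of the Bellman operator and Q-learning, and reports better solution quality, sample complexity, and running time than state-of-the-art methods on classical tasks.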
Contribution
The paper contributes a control-theoretic approach to reinforcement learning, a new control-policy-variable gradient theorem, and a gradient ascent algorithm based on that theorem.
Findings
Proven convergence and optimality of the approach's analogs of the Bellman operator and Q-learning.
Significant improvements in solution quality, sample complexity, and running time over state-of-the-art methods.
Empirical validation on several classical reinforcement learning tasks.
Abstract
We devise a control-theoretic reinforcement learning approach to support direct learning of the optimal policy. We establish various theoretical properties of our approach, such as convergence and optimality of our analog of the Bellman operator and Q-learning, a new control-policy-variable gradient theorem, and a specific gradient ascent algorithm based on this theorem within the context of a specific control-theoretic framework. We empirically evaluate the performance of our control-theoretic approach on several classical reinforcement learning tasks, demonstrating significant improvements in solution quality, sample complexity, and running time over state-of-the-art methods.
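To give a feel for what gradient ascent on a control-policy variable can look like, here is a minimal sketch. It is not the paper's algorithm: it assumes a scalar linear system with quadratic cost (a choice made here for illustration, not stated in the abstract), treats a single feedback gain `k` as the policy variable, and estimates the gradient by finite differences rather than through the paper's gradient theorem. All names and parameter values (`rollout_return`, `gradient_ascent`, `a`, `b`, `q`, `r`) are hypothetical.

```python
import math


def rollout_return(k, a=1.0, b=1.0, q=1.0, r=1.0, x0=1.0, horizon=50):
    """Return of the linear policy u = -k * x on x' = a*x + b*u.

    The per-step reward is the negative quadratic cost -(q*x^2 + r*u^2).
    System and cost parameters are illustrative assumptions.
    """
    x, total = x0, 0.0
    for _ in range(horizon):
        u = -k * x                      # control from the policy variable k
        total -= q * x * x + r * u * u  # accumulate reward (negative cost)
        x = a * x + b * u               # deterministic state transition
    return total


def gradient_ascent(k0=0.3, lr=0.02, steps=500, eps=1e-5):
    """Ascend the return with respect to k using central finite differences."""
    k = k0
    for _ in range(steps):
        grad = (rollout_return(k + eps) - rollout_return(k - eps)) / (2 * eps)
        k += lr * grad                  # ascent step: move k uphill on the return
    return k


k_learned = gradient_ascent()
# Reference value: stationary gain from the scalar Riccati equation for
# a = b = q = r = 1, which solves k^2 + k - 1 = 0.
k_riccati = (math.sqrt(5.0) - 1.0) / 2.0
print(k_learned, k_riccati)
```

With these assumed parameters the learned gain should settle near the Riccati gain, which is a convenient sanity check that updating the control variable directly recovers the known optimal controller in this toy setting.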
