Bellman operator convergence enhancements in reinforcement learning algorithms
David Krame Kadurha, Domini Jocema Leko Moutouo, Yae Ulrich Gaba

TL;DR
This paper explores the mathematical foundations of reinforcement learning, focusing on Bellman operators and their convergence properties, and demonstrates how alternative formulations can improve algorithm efficiency in standard environments.
Contribution
It introduces new formulations of Bellman operators based on topological and metric space concepts, enhancing convergence rates in RL algorithms.
Findings
Alternative Bellman operators improve convergence speed
Mathematical insights lead to more efficient RL algorithms
Successful application to standard RL benchmarks (MountainCar, CartPole, Acrobot)
Abstract
This paper reviews the topological groundwork for the study of reinforcement learning (RL) by focusing on the structure of state, action, and policy spaces. We begin by recalling key mathematical concepts such as complete metric spaces, which form the foundation for expressing RL problems. Using the Banach contraction principle (the Banach fixed-point theorem), we illustrate how Bellman operators, expressed as operators on Banach spaces, ensure the convergence of RL algorithms. The work serves as a bridge between theoretical mathematics and practical algorithm design, offering new approaches to enhance the efficiency of RL. In particular, we investigate alternative formulations of Bellman operators and demonstrate their impact on improving convergence rates and performance in standard RL environments such as MountainCar, CartPole, and Acrobot. Our…
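The contraction argument summarized in the abstract can be illustrated numerically. The following is a minimal sketch, not the paper's method or any of its alternative operators: it applies the classical Bellman optimality operator to a small randomly generated finite MDP and checks that the operator shrinks sup-norm distances by at least the discount factor, which is the property the Banach fixed-point theorem needs. The MDP, the names `bellman_optimality`, `sup_dist`, `random_mdp`, and `value_iteration`, and the values of `gamma` and `tol` are all assumptions chosen for illustration.

```python
import random

def bellman_optimality(V, P, R, gamma):
    """Apply (TV)(s) = max_a [ R[s][a] + gamma * sum_t P[s][a][t] * V[t] ]."""
    n_states = len(V)
    return [
        max(
            R[s][a] + gamma * sum(P[s][a][t] * V[t] for t in range(n_states))
            for a in range(len(R[s]))
        )
        for s in range(n_states)
    ]

def sup_dist(U, V):
    """Sup-norm distance between two value functions."""
    return max(abs(u - v) for u, v in zip(U, V))

def random_mdp(n_states, n_actions, rng):
    """Build a hypothetical MDP: P[s][a] is a probability vector, R[s][a] a reward."""
    P, R = [], []
    for _ in range(n_states):
        rows, rewards = [], []
        for _ in range(n_actions):
            weights = [rng.random() + 1e-9 for _ in range(n_states)]
            total = sum(weights)
            rows.append([w / total for w in weights])
            rewards.append(rng.uniform(-1.0, 1.0))
        P.append(rows)
        R.append(rewards)
    return P, R

def value_iteration(P, R, gamma, tol=1e-10, max_iter=10_000):
    """Iterate V <- TV from zero until successive iterates differ by less than tol."""
    V = [0.0] * len(P)
    for k in range(max_iter):
        V_next = bellman_optimality(V, P, R, gamma)
        if sup_dist(V_next, V) < tol:
            return V_next, k + 1
        V = V_next
    return V, max_iter

rng = random.Random(0)
gamma = 0.9
P, R = random_mdp(n_states=5, n_actions=3, rng=rng)

# Contraction check: ||TU - TW|| <= gamma * ||U - W|| for arbitrary U, W.
U = [rng.uniform(-10, 10) for _ in range(5)]
W = [rng.uniform(-10, 10) for _ in range(5)]
ratio = sup_dist(
    bellman_optimality(U, P, R, gamma), bellman_optimality(W, P, R, gamma)
) / sup_dist(U, W)

# Fixed-point check: the limit of the iteration satisfies TV = V up to tolerance.
V_star, n_iter = value_iteration(P, R, gamma)
residual = sup_dist(bellman_optimality(V_star, P, R, gamma), V_star)

print(f"contraction ratio: {ratio:.4f} (bound: {gamma})")
print(f"iterations: {n_iter}, fixed-point residual: {residual:.2e}")
```

The printed ratio should not exceed `gamma`, and the residual should be close to zero. Because the space of bounded value functions with the sup norm is complete, this contraction property is enough to guarantee a unique fixed point and convergence of the iteration from any starting point.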
Taxonomy
Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control · Control and Stability of Dynamical Systems
