Parallel and Flexible Dynamic Programming via the Randomized Mini-Batch Operator
Matilde Gargiani, Andrea Martinelli, Max Ruts Martinez, John Lygeros

TL;DR
This paper introduces a randomized mini-batch operator for dynamic programming that balances convergence speed and parallelization, enabling more flexible and efficient solutions for Markov decision processes.
Contribution
A novel randomized mini-batch operator for DP that trades off the convergence benefits of the Gauss-Seidel operator against the parallelism of the Bellman operator, supported by theoretical analysis and extensive experiments.
Findings
Methods based on the new operator converge faster than Bellman-based methods.
They parallelize better than Gauss-Seidel-based methods.
Performance adapts well to different MDP structures and hardware setups.
Abstract
The Bellman operator is the foundation of dynamic programming (DP). The Gauss-Seidel operator is an alternative: whereas the Bellman operator processes all states at once, the Gauss-Seidel operator updates one state at a time and incorporates the interim results into the computation. DP methods based on the Gauss-Seidel operator have a provably better convergence rate, but this comes at the price of inherent sequentiality, which prevents them from exploiting modern multi-core systems. In this work we propose a new operator for dynamic programming, the randomized mini-batch operator, which aims to trade off the better convergence rate of Gauss-Seidel-based methods against the parallelization capability of the Bellman operator. After the introduction of the new operator, a theoretical analysis…
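To make the contrast in the abstract concrete, the following is a minimal pure-Python sketch of one sweep of each update style on a tiny, made-up MDP. The function names, the toy transition and cost tables, the cost-minimization convention, and in particular the mini-batch scheme (shuffle the states, split them into batches, update each batch from a common snapshot, and process the batches one after another) are illustrative assumptions. They are one plausible reading of the abstract, not the paper's formal definition of the randomized mini-batch operator.

```python
import random

def backup(s, V, P, c, gamma):
    """Single-state backup: min over actions of cost + discounted expected value."""
    return min(
        c[s][a] + gamma * sum(p * V[t] for t, p in enumerate(P[s][a]))
        for a in range(len(P[s]))
    )

def bellman_sweep(V, P, c, gamma):
    # Every state reads the same old V, so all backups are independent
    # and could run in parallel.
    return [backup(s, V, P, c, gamma) for s in range(len(V))]

def gauss_seidel_sweep(V, P, c, gamma):
    # States are updated one at a time; later states see earlier updates.
    V = list(V)
    for s in range(len(V)):
        V[s] = backup(s, V, P, c, gamma)
    return V

def minibatch_sweep(V, P, c, gamma, batch_size, rng):
    # ASSUMED scheme (not taken from the paper): random batches, parallel
    # within a batch, sequential across batches.
    V = list(V)
    order = list(range(len(V)))
    rng.shuffle(order)
    for i in range(0, len(order), batch_size):
        batch = order[i:i + batch_size]
        # All states in the batch read the same snapshot of V.
        new_vals = [backup(s, V, P, c, gamma) for s in batch]
        for s, v in zip(batch, new_vals):
            V[s] = v  # later batches see these interim results
    return V

# Hypothetical toy MDP: 3 states, 2 actions, P[s][a][t] = transition probability.
P = [
    [[0.5, 0.5, 0.0], [0.0, 0.0, 1.0]],
    [[0.0, 1.0, 0.0], [0.3, 0.0, 0.7]],
    [[1.0, 0.0, 0.0], [0.0, 0.5, 0.5]],
]
c = [[1.0, 2.0], [0.5, 1.5], [0.0, 3.0]]  # c[s][a] = stage cost
gamma = 0.9

def run(sweep, n_sweeps=400):
    """Apply a sweep function repeatedly, starting from the zero vector."""
    V = [0.0] * len(P)
    for _ in range(n_sweeps):
        V = sweep(V)
    return V
```

In this sketch, setting `batch_size` to the number of states reproduces the Bellman sweep, while `batch_size=1` gives a Gauss-Seidel sweep in random state order; intermediate values sit between the two. For example, `run(lambda V: minibatch_sweep(V, P, c, gamma, 2, random.Random(0)))` iterates the mini-batch variant with batches of two states.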
Taxonomy
Topics: Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics · Machine Learning and ELM
