A General Family of Robust Stochastic Operators for Reinforcement Learning
Yingdong Lu, Mark S. Squillante, Chai Wah Wu

TL;DR
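This paper introduces a new family of stochastic operators for reinforcement learning that is more robust to approximation and estimation errors than the classical Bellman operator. The authors prove that the operators preserve optimality and increase the action gap, and report empirical results in which they outperform both the classical Bellman operator and recently proposed alternatives.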
Contribution
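It proposes a novel family of stochastic operators that improve robustness to approximation and estimation errors and increase the action gap in reinforcement learning, with theoretical guarantees established on a sample path basis and empirical results that outperform the classical Bellman operator and recently proposed operators.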
Findings
The operators preserve optimality on a sample path basis
They increase the action gap, also on a sample path basis
Empirically, they outperform the classical Bellman operator and recently proposed operators
Abstract
We consider a new family of operators for reinforcement learning with the goal of alleviating the negative effects of, and becoming more robust to, approximation or estimation errors. Various theoretical results are established, which include showing on a sample path basis that our family of operators preserves optimality and increases the action gap. Our empirical results illustrate the strong benefits of our family of operators, significantly outperforming the classical Bellman operator and recently proposed operators.
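The abstract does not state how the operators are defined, so the following is only a minimal sketch of the two ideas it names: the action gap and an operator that increases it while leaving the greedy values unchanged. It assumes a hypothetical two-state, two-action MDP and an update of the form "Bellman backup minus beta times (V(x) - Q(x, a))", with beta drawn at random at each step. That form, the range of beta, and all names below are illustrative assumptions, not the authors' definition.

```python
import random

GAMMA = 0.9  # discount factor (assumed for this toy example)

# Hypothetical 2-state, 2-action MDP.
# P[x][a] is a list of (probability, next_state); R[x][a] is the reward.
P = [
    [[(1.0, 0)], [(1.0, 1)]],
    [[(1.0, 0)], [(1.0, 1)]],
]
R = [[0.0, 1.0], [0.0, 2.0]]


def bellman(Q):
    """Classical Bellman optimality operator applied to a Q-table."""
    return [
        [R[x][a] + GAMMA * sum(p * max(Q[y]) for p, y in P[x][a])
         for a in range(2)]
        for x in range(2)
    ]


def gap_increasing(Q, beta):
    """Bellman backup minus beta * (V(x) - Q(x, a)).

    The penalty is zero for the greedy action and positive otherwise,
    so suboptimal actions are pushed down. beta = 0 recovers bellman().
    """
    TQ = bellman(Q)
    return [
        [TQ[x][a] - beta * (max(Q[x]) - Q[x][a]) for a in range(2)]
        for x in range(2)
    ]


def action_gap(Q, x):
    """Difference between the best and second-best action values in state x."""
    best, second = sorted(Q[x], reverse=True)[:2]
    return best - second


def iterate(sample_beta, steps=500):
    """Apply the operator repeatedly, drawing a fresh beta at every step."""
    Q = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(steps):
        Q = gap_increasing(Q, sample_beta())
    return Q


rng = random.Random(0)
Q_classical = iterate(lambda: 0.0)                      # plain value iteration
Q_stochastic = iterate(lambda: rng.uniform(0.0, 0.9))   # random beta each step

for x in range(2):
    print("state", x,
          "greedy value:", max(Q_classical[x]), max(Q_stochastic[x]),
          "action gap:", action_gap(Q_classical, x), action_gap(Q_stochastic, x))
```

In this toy problem the greedy value in each state is the same under both iterations, while the gap between the best and second-best action is at least as large under the randomized update. Because beta is redrawn at every step, the values of suboptimal actions keep fluctuating rather than settling to a single number, which is one reason statements about such operators are naturally phrased per sample path.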
Taxonomy
Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications · Adaptive Dynamic Programming Control
