A Scalable Finite Difference Method for Deep Reinforcement Learning
Matthew Allen, John Raisbeck, and Hakho Lee

TL;DR
This paper introduces a scalable finite difference method for deep reinforcement learning that efficiently utilizes older data, reducing idle time and computational waste in distributed settings.
Contribution
A novel finite difference algorithm that leverages older data to improve scalability and efficiency in distributed deep reinforcement learning.
Findings
Reduces idle time in distributed algorithms
Maintains performance comparable to existing methods
Improves computational efficiency in large-scale setups
Abstract
Several low-bandwidth, distributable black-box optimization algorithms in the finite difference family, such as Evolution Strategies, have recently been shown to perform nearly as well as tailored Reinforcement Learning methods in some Reinforcement Learning domains. One shortcoming of these black-box methods is that they must collect information about the structure of the return function at every update, and can often employ only information drawn from a distribution centered on the current parameters. As a result, when these algorithms are distributed across many machines, a significant portion of total runtime may be spent with many machines idle, waiting first for a final return and then for an update to be calculated. In this work we introduce a novel method to use older data in finite difference algorithms, which produces a scalable algorithm that avoids significant idle time or…
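To make the baseline concrete, here is a minimal sketch of a standard antithetic finite difference update in the style of Evolution Strategies. It is not the method introduced in the paper, which the truncated abstract does not specify. The names `es_gradient_estimate`, `toy_return`, and `optimize`, the quadratic stand-in for the return function, and all hyperparameter values are assumptions made for illustration. Plain gradient ascent is used in place of an optimizer such as Adam to keep the example short.

```python
import random


def es_gradient_estimate(f, theta, sigma=0.1, num_pairs=50, rng=None):
    """Antithetic finite difference estimate of the gradient of f at theta."""
    rng = rng or random.Random(0)
    dim = len(theta)
    grad = [0.0] * dim
    for _ in range(num_pairs):
        # Sample a Gaussian perturbation centered on the current parameters.
        eps = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        plus = f([t + sigma * e for t, e in zip(theta, eps)])
        minus = f([t - sigma * e for t, e in zip(theta, eps)])
        # Central difference along the sampled direction.
        scale = (plus - minus) / (2.0 * sigma)
        for i in range(dim):
            grad[i] += scale * eps[i] / num_pairs
    return grad


def toy_return(theta):
    # Hypothetical stand-in for an episode return, maximized at (1, -2).
    return -((theta[0] - 1.0) ** 2 + (theta[1] + 2.0) ** 2)


def optimize(steps=200, lr=0.05):
    rng = random.Random(0)
    theta = [0.0, 0.0]
    for _ in range(steps):
        # Every update needs fresh returns sampled around the current theta.
        grad = es_gradient_estimate(toy_return, theta, rng=rng)
        # Gradient ascent on the return.
        theta = [t + lr * g for t, g in zip(theta, grad)]
    return theta


if __name__ == "__main__":
    print(optimize())
```

In a distributed setting, each `plus` and `minus` evaluation would typically run on a separate worker. The update cannot be computed until the slowest evaluation returns, and the samples are discarded once `theta` changes. This is the idle time and data waste that the abstract describes.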
Taxonomy
Topics: Evolutionary Algorithms and Applications · Metaheuristic Optimization Algorithms Research · Protein Degradation and Inhibitors
Methods: Adam
