A Scalable Finite Difference Method for Deep Reinforcement Learning

Matthew Allen, John Raisbeck, and Hakho Lee

arXiv:2210.07487·cs.LG·January 20, 2023

TL;DR

This paper introduces a scalable finite difference method for deep reinforcement learning that efficiently utilizes older data, reducing idle time and computational waste in distributed settings.

Contribution

A novel finite difference algorithm that leverages older data to improve scalability and efficiency in distributed deep reinforcement learning.

Findings

01. Reduces idle time in distributed algorithms
02. Maintains performance comparable to existing methods
03. Improves computational efficiency in large-scale setups

Abstract

Several low-bandwidth, distributable black-box optimization algorithms in the family of finite differences, such as Evolution Strategies, have recently been shown to perform nearly as well as tailored Reinforcement Learning methods in some Reinforcement Learning domains. One shortcoming of these black-box methods is that they must collect information about the structure of the return function at every update, and can often employ only information drawn from a distribution centered around the current parameters. As a result, when these algorithms are distributed across many machines, a significant portion of total runtime may be spent with many machines idle, waiting for a final return and then for an update to be calculated. In this work we introduce a novel method to use older data in finite difference algorithms, which produces a scalable algorithm that avoids significant idle time or…
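To make the baseline concrete, here is a minimal sketch of the kind of finite difference update the abstract describes, in the style of Evolution Strategies with antithetic sampling. It is not the paper's method: the technique for reusing older data is not specified in the text above and is not attempted here. The names `es_gradient_estimate`, `toy_return`, and `optimize`, and the values of `sigma`, `num_pairs`, and `lr`, are assumptions chosen for illustration.

```python
import random


def es_gradient_estimate(return_fn, theta, sigma=0.1, num_pairs=50, rng=None):
    """Antithetic finite difference estimate of the gradient of return_fn at theta.

    Every perturbation is centered on the current parameters, which is the
    restriction the abstract points out for this family of methods.
    """
    rng = rng or random.Random(0)
    dim = len(theta)
    grad = [0.0] * dim
    for _ in range(num_pairs):
        # Sample one Gaussian direction and evaluate the return on both sides.
        eps = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        plus = [t + sigma * e for t, e in zip(theta, eps)]
        minus = [t - sigma * e for t, e in zip(theta, eps)]
        # The two evaluations are independent, so in a distributed setting
        # each one could run on a separate worker.
        diff = return_fn(plus) - return_fn(minus)
        for i in range(dim):
            grad[i] += diff * eps[i] / (2.0 * sigma * num_pairs)
    return grad


def toy_return(params):
    # Hypothetical stand-in for an episode return, maximized at (1, -2).
    return -((params[0] - 1.0) ** 2 + (params[1] + 2.0) ** 2)


def optimize(steps=200, lr=0.05):
    # Plain gradient ascent on the estimated gradient (assumed optimizer).
    rng = random.Random(0)
    theta = [0.0, 0.0]
    for _ in range(steps):
        grad = es_gradient_estimate(toy_return, theta, rng=rng)
        theta = [t + lr * g for t, g in zip(theta, grad)]
    return theta
```

In this sketch the update cannot be computed until all `num_pairs` evaluations have returned, and every evaluation is discarded once `theta` changes. That synchronization point is the source of the idle time the abstract refers to.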

Taxonomy

Topics: Evolutionary Algorithms and Applications · Metaheuristic Optimization Algorithms Research · Protein Degradation and Inhibitors

Methods: Adam