Reinforcement Learning through Asynchronous Advantage Actor-Critic on a GPU
Mohammad Babaeizadeh, Iuri Frosio, Stephen Tyree, Jason Clemons, Jan Kautz

TL;DR
This paper presents a hybrid CPU/GPU implementation of the A3C reinforcement learning algorithm, significantly improving training speed by leveraging GPU computation and introducing a dynamic scheduling system.
Contribution
The paper introduces a novel hybrid CPU/GPU version of A3C with a dynamic scheduling strategy, enhancing computational efficiency and speed.
Findings
Achieved a significant speedup over a CPU implementation
Developed a system of queues and dynamic scheduling for asynchronous algorithms
Made the implementation publicly available for research use
Abstract
We introduce a hybrid CPU/GPU version of the Asynchronous Advantage Actor-Critic (A3C) algorithm, currently the state-of-the-art method in reinforcement learning for various gaming tasks. We analyze its computational traits and concentrate on aspects critical to leveraging the GPU's computational power. We introduce a system of queues and a dynamic scheduling strategy, potentially helpful for other asynchronous algorithms as well. Our hybrid CPU/GPU version of A3C, based on TensorFlow, achieves a significant speedup compared to a CPU implementation; we make it publicly available to other researchers at https://github.com/NVlabs/GA3C.
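The queue idea in the abstract can be illustrated with a small sketch. This is not the GA3C code: it only shows, under stated assumptions, how several CPU-side agents might push prediction requests onto a shared queue so that a single consumer can evaluate them in one batched call, which is the kind of workload a GPU handles well. All names here (`batched_predictor`, `agent`, `toy_model`, `run_demo`) are hypothetical, the toy model stands in for a TensorFlow network, and training and dynamic scheduling are left out.

```python
import queue
import threading


def toy_model(states):
    # Hypothetical stand-in for a batched network forward pass.
    return [s % 3 for s in states]


def batched_predictor(request_q, reply_qs, model_fn, max_batch, stop):
    """Collect up to max_batch pending requests and answer them with one model call."""
    while not stop.is_set():
        try:
            first = request_q.get(timeout=0.05)
        except queue.Empty:
            continue
        batch = [first]
        while len(batch) < max_batch:
            try:
                batch.append(request_q.get_nowait())
            except queue.Empty:
                break
        agent_ids = [agent_id for agent_id, _ in batch]
        states = [state for _, state in batch]
        outputs = model_fn(states)  # one call for the whole batch
        for agent_id, out in zip(agent_ids, outputs):
            reply_qs[agent_id].put(out)


def agent(agent_id, request_q, reply_q, n_steps, results):
    """Toy agent: ask for an action, then apply a made-up environment transition."""
    state = agent_id
    trace = []
    for _ in range(n_steps):
        request_q.put((agent_id, state))
        action = reply_q.get()  # block until the predictor replies
        trace.append(action)
        state = state + action
    results[agent_id] = trace


def run_demo(n_agents=4, n_steps=5, max_batch=8):
    request_q = queue.Queue()
    reply_qs = [queue.Queue() for _ in range(n_agents)]
    results = [None] * n_agents
    stop = threading.Event()
    predictor = threading.Thread(
        target=batched_predictor,
        args=(request_q, reply_qs, toy_model, max_batch, stop),
    )
    predictor.start()
    agents = [
        threading.Thread(target=agent, args=(i, request_q, reply_qs[i], n_steps, results))
        for i in range(n_agents)
    ]
    for t in agents:
        t.start()
    for t in agents:
        t.join()
    stop.set()
    predictor.join()
    return results


if __name__ == "__main__":
    print(run_demo())
```

Each agent's trajectory depends only on its own state, so the result is the same regardless of how requests happen to be grouped into batches. A dynamic scheduling strategy would sit on top of a layout like this, for example by adjusting the number of agent or predictor threads at run time, but the sketch does not attempt that.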
Taxonomy
Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications · Parallel Computing and Optimization Techniques
Methods: Entropy Regularization · Convolution · Dense Connections · Softmax · A3C
