Virtual Replay Cache
Brett Daley, Christopher Amato

TL;DR
The paper introduces the Virtual Replay Cache, a new data structure that reduces memory usage and improves efficiency in return caching for deep reinforcement learning, demonstrated on Atari games.
Contribution
It proposes the Virtual Replay Cache (VRC), addressing memory and efficiency issues in return caching for reinforcement learning.
Findings
VRC nearly eliminates DQN(λ)'s cache memory footprint.
VRC slightly reduces total training time on Atari 2600 games on the authors' hardware.
Together, these results make return caching cheaper in both memory and data copying for deep RL training.
Abstract
Return caching is a recent strategy that enables efficient minibatch training with multistep estimators (e.g. the λ-return) for deep reinforcement learning. By precomputing return estimates in sequential batches and then storing the results in an auxiliary data structure for later sampling, the average computation spent per estimate can be greatly reduced. Still, the efficiency of return caching could be improved, particularly with regard to its large memory usage and repetitive data copies. We propose a new data structure, the Virtual Replay Cache (VRC), to address these shortcomings. When learning to play Atari 2600 games, the VRC nearly eliminates DQN(λ)'s cache memory footprint and slightly reduces the total training time on our hardware.
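To make the idea of precomputing return estimates in sequential batches concrete, here is a minimal sketch of the baseline return-caching step that the abstract describes. It is not the VRC itself and not the authors' code. It assumes the standard backward recursion for the λ-return, R_t = r_t + γ[(1 − λ)·V(s_{t+1}) + λ·R_{t+1}], with a full bootstrap at the end of the block. The names `lambda_returns`, `build_cache`, and `sample_minibatch` are hypothetical, and `next_values` stands in for whatever bootstrap estimate the agent uses (for example, max over actions of Q).

```python
import random


def lambda_returns(rewards, next_values, dones, gamma=0.99, lam=0.8):
    """Compute lambda-returns for one contiguous block, sweeping backward in time.

    rewards[t]     : reward received at step t
    next_values[t] : bootstrap value estimate for the state reached after step t
    dones[t]       : True if step t ended an episode
    """
    n = len(rewards)
    returns = [0.0] * n
    next_return = None  # no later return is available at the end of the block
    for t in reversed(range(n)):
        if dones[t]:
            # Terminal step: nothing to bootstrap from.
            returns[t] = rewards[t]
        elif next_return is None:
            # Last step of the block: fall back to a one-step bootstrap.
            returns[t] = rewards[t] + gamma * next_values[t]
        else:
            # Mix the one-step bootstrap with the already computed later return.
            returns[t] = rewards[t] + gamma * (
                (1 - lam) * next_values[t] + lam * next_return
            )
        next_return = returns[t]
    return returns


def build_cache(rewards, next_values, dones, gamma=0.99, lam=0.8):
    """Pair each step index with its precomputed return (hypothetical cache layout)."""
    returns = lambda_returns(rewards, next_values, dones, gamma, lam)
    return list(enumerate(returns))


def sample_minibatch(cache, batch_size, rng=random):
    """Draw cached (index, return) pairs uniformly without replacement."""
    return rng.sample(cache, batch_size)
```

Because each return reuses the one computed for the following step, a block of n transitions costs one backward pass rather than n separate multistep sums, which is where the reduced per-estimate computation comes from. The cache in this sketch is a separate list of copied values; that kind of auxiliary storage and copying is the overhead the abstract says the VRC is designed to reduce.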
Taxonomy
Topics: Reinforcement Learning in Robotics · Artificial Intelligence in Games · Parallel Computing and Optimization Techniques
