Virtual Replay Cache

Brett Daley; Christopher Amato

arXiv:2112.03421·cs.LG·December 8, 2021

TL;DR

The paper introduces the Virtual Replay Cache, a new data structure that reduces memory usage and improves efficiency in return caching for deep reinforcement learning, demonstrated on Atari games.

Contribution

The paper proposes the Virtual Replay Cache (VRC), a data structure that addresses two shortcomings of return caching in deep reinforcement learning: large memory usage and repetitive data copies.

Findings

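01  The VRC nearly eliminates DQN(λ)'s cache memory footprint when learning to play Atari 2600 games.

02  The VRC slightly reduces total training time on the authors' hardware.

03  Together, these results show that return caching for deep RL training can be made more efficient.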

Abstract

Return caching is a recent strategy that enables efficient minibatch training with multistep estimators (e.g. the λ-return) for deep reinforcement learning. By precomputing return estimates in sequential batches and then storing the results in an auxiliary data structure for later sampling, the average computation spent per estimate can be greatly reduced. Still, the efficiency of return caching could be improved, particularly with regard to its large memory usage and repetitive data copies. We propose a new data structure, the Virtual Replay Cache (VRC), to address these shortcomings. When learning to play Atari 2600 games, the VRC nearly eliminates DQN(λ)'s cache memory footprint and slightly reduces the total training time on our hardware.
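
To make the return-caching idea in the abstract concrete, here is a minimal Python sketch of precomputing λ-returns over one sequential block and storing them for later minibatch sampling. It is not the authors' implementation and does not reproduce the VRC itself. The function names (`lambda_returns`, `build_cache`, `sample_minibatch`), the recursive form of the λ-return used here, and the default values of `gamma` and `lam` are all assumptions made for illustration.

```python
import random


def lambda_returns(rewards, next_values, dones, gamma=0.99, lam=0.8):
    """Compute lambda-returns for one sequential block by backward recursion.

    Assumed recursion (a common form, not taken from the paper):
        G[t] = r[t] + gamma * ((1 - lam) * V[t+1] + lam * G[t+1])

    rewards[t]     : reward received after step t
    next_values[t] : bootstrap value of the next state, e.g. max_a Q(s[t+1], a)
    dones[t]       : True if the next state is terminal
    """
    n = len(rewards)
    returns = [0.0] * n
    # Past the end of the block, fall back to the last bootstrap value.
    next_return = next_values[-1]
    for t in reversed(range(n)):
        if dones[t]:
            # No bootstrapping across an episode boundary.
            returns[t] = rewards[t]
        else:
            blended = (1.0 - lam) * next_values[t] + lam * next_return
            returns[t] = rewards[t] + gamma * blended
        next_return = returns[t]
    return returns


def build_cache(indices, rewards, next_values, dones, gamma=0.99, lam=0.8):
    """Pair each replay index with its precomputed return (hypothetical layout)."""
    returns = lambda_returns(rewards, next_values, dones, gamma, lam)
    return list(zip(indices, returns))


def sample_minibatch(cache, batch_size, rng):
    """Draw (index, return) pairs without replacement from the cache."""
    return rng.sample(cache, batch_size)


if __name__ == "__main__":
    cache = build_cache(
        indices=[100, 101, 102],
        rewards=[1.0, 1.0, 1.0],
        next_values=[0.0, 0.0, 0.0],
        dones=[False, False, True],
        gamma=0.5,
        lam=1.0,
    )
    print(cache)
    print(sample_minibatch(cache, 2, random.Random(0)))
```

One backward pass produces every return in the block, so each estimate costs a constant amount of work instead of a separate multistep rollout. The price is the extra storage for the cached results, which is the overhead the abstract says the VRC is designed to reduce.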

Code & Models

Repositories

brett-daley/virtual-replay-cache (tf, official)

Taxonomy

Topics: Reinforcement Learning in Robotics · Artificial Intelligence in Games · Parallel Computing and Optimization Techniques