Reverb: A Framework For Experience Replay

Albin Cassirer; Gabriel Barth-Maron; Eugene Brevdo; Sabela Ramos; Toby Boyd; Thibault Sottiaux; Manuel Kroiss

arXiv:2102.04736 · cs.LG · February 10, 2021 · 6 cites

TL;DR

Reverb is an efficient, extensible system for experience replay in reinforcement learning. It is designed for distributed configurations with up to thousands of concurrent clients and lets users configure how data is selected, removed, and rate-limited in the replay buffer.

Contribution

Introduces Reverb, a novel system for experience replay that is efficient, extensible, and suitable for large-scale distributed reinforcement learning.

Findings

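01 Reverb is designed to work efficiently in distributed configurations with up to thousands of concurrent clients.

02 The API lets users configure how elements are selected from and removed from the buffer, and control the ratio between sampled and inserted elements.

03 The paper reports empirical results on Reverb's performance characteristics.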

Abstract

A central component of training in Reinforcement Learning (RL) is Experience: the data used for training. The mechanisms used to generate and consume this data have an important effect on the performance of RL algorithms. In this paper, we introduce Reverb: an efficient, extensible, and easy to use system designed specifically for experience replay in RL. Reverb is designed to work efficiently in distributed configurations with up to thousands of concurrent clients. The flexible API provides users with the tools to easily and accurately configure the replay buffer. It includes strategies for selecting and removing elements from the buffer, as well as options for controlling the ratio between sampled and inserted elements. This paper presents the core design of Reverb, gives examples of how it can be applied, and provides empirical results of Reverb's performance characteristics.
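The three configuration points named in the abstract (a selection strategy, a removal strategy, and a ratio between sampled and inserted elements) can be illustrated with a small toy buffer. This is a minimal sketch under stated assumptions, not Reverb's API or implementation: the class `ToyReplayTable` and the parameters `samples_per_insert`, `error_buffer`, and `min_size_to_sample` are hypothetical names chosen for this example, selection is uniform, removal is first-in-first-out, and a refused call returns `False` or `None` rather than waiting.

```python
import random
from collections import OrderedDict


class ToyReplayTable:
    """Toy replay table: uniform sampling, FIFO removal, sample/insert ratio.

    Illustrative only; this is not Reverb's API or implementation.
    """

    def __init__(self, max_size, samples_per_insert, error_buffer,
                 min_size_to_sample=1, seed=0):
        self.max_size = max_size
        self.samples_per_insert = samples_per_insert
        self.error_buffer = error_buffer
        self.min_size_to_sample = min_size_to_sample
        self.items = OrderedDict()  # key -> item, oldest first
        self.num_inserts = 0
        self.num_samples = 0
        self._rng = random.Random(seed)

    def _deviation(self, inserts, samples):
        # Positive: inserts are ahead of the target ratio; negative: samples are.
        return inserts * self.samples_per_insert - samples

    def can_insert(self):
        if self.num_inserts < self.min_size_to_sample:
            return True  # always allow filling up to the minimum size
        ahead = self._deviation(self.num_inserts + 1, self.num_samples)
        return ahead <= self.error_buffer

    def can_sample(self):
        if self.num_inserts < self.min_size_to_sample:
            return False
        behind = self._deviation(self.num_inserts, self.num_samples + 1)
        return behind >= -self.error_buffer

    def insert(self, item):
        """Insert an item; returns False if the ratio limit refuses it."""
        if not self.can_insert():
            return False
        self.items[self.num_inserts] = item
        self.num_inserts += 1
        if len(self.items) > self.max_size:
            self.items.popitem(last=False)  # remover: drop the oldest (FIFO)
        return True

    def sample(self):
        """Return a uniformly chosen item, or None if the ratio limit refuses."""
        if not self.can_sample():
            return None
        self.num_samples += 1
        return self._rng.choice(list(self.items.values()))
```

With `samples_per_insert=2.0` and `error_buffer=2.0`, a second insert is refused until two samples have been drawn, which keeps the writer from running ahead of the reader. Swapping the body of `sample` or the `popitem` call is where a different selection or removal strategy would plug into this sketch.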

Code & Models

Repositories

deepmind/reverb (tf, Official)

Taxonomy

Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research · Data Stream Mining Techniques

Methods: Experience Replay