# WALL-E: An Efficient Reinforcement Learning Research Framework

**Authors:** Tianbing Xu, Andrew Zhang, Liang Zhao

arXiv: 1901.06086 · 2019-01-29

## TL;DR

WALL-E is a parallel multi-process framework for reinforcement learning that speeds up experience collection by running several rollout samplers at once. On MuJoCo tasks, the authors report faster convergence and higher average return than with a single-process setup.

## Contribution

The paper presents WALL-E, a framework that runs multiple rollout samplers in parallel processes to cut experience collection time, the main bottleneck when each iteration needs many samples. The authors report faster convergence and higher average rewards as a result.

## Key findings

- Parallel samplers converge faster than a single sampler process.
- Average return on MuJoCo tasks is higher, for example on HalfCheetah-v2 with $N = 10$ sampler processes.
- Experience collection scales with the number of sampler processes.
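
A back-of-envelope way to read the scalability finding is the idealized timing model below. It is an assumption made for illustration, not a formula from the paper: $T_c$ (single-process collection time per iteration) and $T_l$ (policy learning time per iteration) are hypothetical symbols, collection is assumed to split perfectly across $N$ samplers, and inter-process overhead is ignored.

$$
T(N) \approx \frac{T_c}{N} + T_l, \qquad S(N) = \frac{T(1)}{T(N)} = \frac{T_c + T_l}{T_c / N + T_l}
$$

With illustrative values $T_c = 0.9$ and $T_l = 0.1$ (not measurements), $N = 10$ gives $S(10) = 1 / 0.19 \approx 5.3$. Under this model the speedup is capped at $(T_c + T_l) / T_l$, so adding samplers pays off most when collection dominates the iteration.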

## Abstract

The running time of an RL system has two parts: experience collection and policy learning. When rollouts require a large number of samples, experience collection is the major bottleneck, so rollout generation needs to be sped up with a multi-process architecture. Our work, dubbed WALL-E, uses multiple rollout samplers running in parallel to generate experience rapidly. With parallel samplers, we observe not only faster convergence but also higher average rewards. For example, on the MuJoCo HalfCheetah-v2 task, with $N = 10$ parallel sampler processes, we achieve a much higher average return than with a single-process architecture.
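
The parallel-sampler idea in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not WALL-E's actual code: `run_sampler`, `collect_experience`, the toy random-walk environment, and the random policy are all hypothetical stand-ins (a real setup would step a MuJoCo environment with the current policy). It uses only the Python standard library.

```python
import random
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def run_sampler(job):
    """Collect one rollout from a toy random-walk environment.

    Hypothetical stand-in for a MuJoCo rollout; not the paper's sampler.
    """
    seed, horizon = job
    rng = random.Random(seed)  # distinct seed per sampler, so rollouts differ
    state, trajectory = 0.0, []
    for _ in range(horizon):
        action = rng.choice([-1.0, 1.0])  # placeholder random policy
        next_state = state + action
        reward = -abs(next_state)         # toy reward: stay near the origin
        trajectory.append((state, action, reward))
        state = next_state
    return trajectory


def collect_experience(num_samplers, horizon, base_seed=0,
                       executor_cls=ProcessPoolExecutor):
    """Run num_samplers rollouts concurrently and merge them into one batch.

    executor_cls defaults to worker processes; ThreadPoolExecutor can be
    passed instead for quick single-process checks.
    """
    jobs = [(base_seed + i, horizon) for i in range(num_samplers)]
    with executor_cls(max_workers=num_samplers) as pool:
        rollouts = list(pool.map(run_sampler, jobs))  # results keep job order
    # Flatten per-sampler trajectories into the batch a learner would consume.
    return [step for rollout in rollouts for step in rollout]
```

Calling `collect_experience(num_samplers=10, horizon=100)` under an `if __name__ == "__main__":` guard would return 1000 transitions gathered by ten worker processes. The guard matters on platforms where worker processes re-import the main module. In a full training loop the learner would consume this batch, update the policy, and send the new parameters to the samplers before the next round; that step is omitted here.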

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1901.06086/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/1901.06086/full.md

## References

4 references — full list in the complete paper: https://tomesphere.com/paper/1901.06086/full.md

---
Source: https://tomesphere.com/paper/1901.06086