# Deep Neuroevolution of Recurrent and Discrete World Models

**Authors:** Sebastian Risi, Kenneth O. Stanley

arXiv: 1906.08857 · 2019-06-24

## TL;DR

This paper shows that world models with recurrent and discrete components can be trained end-to-end with a genetic algorithm (GA), reaching performance comparable to the original, separately trained world model on a challenging car racing task and opening up classical planning in latent space.

## Contribution

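The paper demonstrates that deep neuroevolution can train a complex, multi-component world model (visual processing, memory, and decision-making) end-to-end, including models with discrete variables, without the separate, specialized training stages the original approach requires.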
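To make the idea of end-to-end evolution concrete, here is a minimal sketch of a mutation-only genetic algorithm that optimizes one flat genome holding the weights of three toy components. It is an illustration under stated assumptions, not the paper's implementation: the population size, truncation selection, elitism, mutation strength `sigma`, and the `toy_fitness` function (with its tiny "vision", "memory", and "controller" weight slices) are all hypothetical choices made for this example.

```python
import numpy as np


def toy_fitness(genome):
    """Hypothetical fitness: three components sliced from one flat genome."""
    w_vision = genome[:8].reshape(4, 2)    # stand-in for the visual component
    w_memory = genome[8:12].reshape(2, 2)  # stand-in for the memory component
    w_control = genome[12:14]              # stand-in for the controller
    x = np.linspace(-1.0, 1.0, 40).reshape(10, 4)  # fixed toy observations
    target = x.sum(axis=1)                          # toy target "action"
    z = np.tanh(x @ w_vision)
    h = np.tanh(z @ w_memory)
    action = h @ w_control
    return -float(np.mean((action - target) ** 2))  # higher is better


def evolve(fitness_fn, num_params, pop_size=32, elite_frac=0.25,
           sigma=0.1, generations=50, seed=0):
    """Mutation-only GA with truncation selection and one unchanged elite."""
    rng = np.random.default_rng(seed)
    population = rng.normal(0.0, 1.0, size=(pop_size, num_params))
    n_elite = max(1, int(pop_size * elite_frac))
    history = []
    for _ in range(generations):
        scores = np.array([fitness_fn(ind) for ind in population])
        history.append(float(scores.max()))
        # Keep the top-scoring individuals, best first.
        elite = population[np.argsort(scores)[::-1][:n_elite]]
        # Children are Gaussian-mutated copies of randomly chosen elites.
        parents = elite[rng.integers(0, n_elite, size=pop_size - 1)]
        children = parents + sigma * rng.normal(size=parents.shape)
        # The single best individual survives unmodified (elitism).
        population = np.vstack([elite[:1], children])
    scores = np.array([fitness_fn(ind) for ind in population])
    history.append(float(scores.max()))
    return population[np.argmax(scores)], history


best, history = evolve(toy_fitness, num_params=14, generations=20)
```

Because every component lives in the same genome and only the final fitness is scored, no component needs its own loss or training stage. In this sketch, elitism also guarantees that the best fitness never decreases from one generation to the next.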

## Key findings

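- A genetic algorithm reaches performance comparable to the original gradient-trained world model on a challenging car racing task.
- The evolved visual and memory systems include effective representations similar to those of the system trained through gradient descent.
- GAs work directly with discrete variables, which gradient descent methods struggle with, opening up opportunities for classical planning in latent space.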
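The last finding can be illustrated with a small sketch of what classical planning over a discrete latent space might look like. Everything here is an assumption for illustration: latent states are short binary tuples, and `toy_step` is a hand-written stand-in for a learned transition model, not something taken from the paper. The planner is plain breadth-first search.

```python
from collections import deque


def plan(start, goal, actions, step):
    """Breadth-first search over discrete states.

    Returns the shortest list of actions from start to goal, or None.
    """
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, path = frontier.popleft()
        if state == goal:
            return path
        for action in actions:
            nxt = step(state, action)
            if nxt not in visited:
                visited.add(nxt)
                frontier.append((nxt, path + [action]))
    return None


def toy_step(state, action):
    """Hypothetical transition model: action i flips bit i of the latent code."""
    bits = list(state)
    bits[action] ^= 1
    return tuple(bits)


path = plan(start=(0, 0, 0), goal=(1, 0, 1), actions=[0, 1, 2], step=toy_step)
print(path)  # [0, 2]
```

Search of this kind relies on states being hashable and exactly comparable, which is what a discrete latent code provides and a continuous one does not.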

## Abstract

Neural architectures inspired by our own human cognitive system, such as the recently introduced world models, have been shown to outperform traditional deep reinforcement learning (RL) methods in a variety of different domains. Instead of the relatively simple architectures employed in most RL experiments, world models rely on multiple different neural components that are responsible for visual information processing, memory, and decision-making. However, so far the components of these models have to be trained separately and through a variety of specialized training methods. This paper demonstrates the surprising finding that models with the same precise parts can be instead efficiently trained end-to-end through a genetic algorithm (GA), reaching a comparable performance to the original world model by solving a challenging car racing task. An analysis of the evolved visual and memory system indicates that they include a similar effective representation to the system trained through gradient descent. Additionally, in contrast to gradient descent methods that struggle with discrete variables, GAs also work directly with such representations, opening up opportunities for classical planning in latent space. This paper adds additional evidence on the effectiveness of deep neuroevolution for tasks that require the intricate orchestration of multiple components in complex heterogeneous architectures.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1906.08857/full.md

## Figures

10 figures with captions in the complete paper: https://tomesphere.com/paper/1906.08857/full.md

## References

50 references — full list in the complete paper: https://tomesphere.com/paper/1906.08857/full.md

---
Source: https://tomesphere.com/paper/1906.08857