Memory-based control with recurrent neural networks

Nicolas Heess; Jonathan J Hunt; Timothy P Lillicrap; David Silver

arXiv:1512.04455·cs.LG·December 15, 2015·219 cites

TL;DR

This paper extends model-free reinforcement learning algorithms for continuous control with recurrent neural networks so that they can handle partially observed control problems, and demonstrates success on a range of memory-dependent tasks, including tasks with pixel-based observations.

Contribution

It extends deterministic policy gradient and stochastic value gradient to partially observed control tasks using recurrent neural networks trained via backpropagation through time.

Findings

01 Recurrent policies solve diverse memory tasks including noisy sensor integration.

02 The approach handles high-dimensional pixel observations directly.

03 Recurrent deterministic and stochastic policies perform similarly on complex tasks.
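
To illustrate why memory helps with noisy sensors, here is a minimal sketch that is not taken from the paper. It compares an estimate built from the latest observation alone with one that carries a running mean forward in time, which is a hand-written recurrent update rather than a learned LSTM. The function names, the Gaussian noise model, and all parameter values are assumptions chosen for illustration.

```python
import random

def running_mean(observations):
    """Recurrent update m_t = m_{t-1} + (y_t - m_{t-1}) / t over a sequence."""
    mean = 0.0
    for t, y in enumerate(observations, start=1):
        mean += (y - mean) / t
    return mean

def compare_errors(true_value=0.5, noise_std=1.0, steps=50, trials=500, seed=0):
    """Mean absolute error of a memoryless vs. a memory-based estimate.

    All defaults are hypothetical values, not settings from the paper.
    """
    rng = random.Random(seed)
    err_last, err_memory = 0.0, 0.0
    for _ in range(trials):
        # Noisy sensor readings of a fixed hidden quantity.
        obs = [true_value + rng.gauss(0.0, noise_std) for _ in range(steps)]
        err_last += abs(obs[-1] - true_value)            # uses only the last reading
        err_memory += abs(running_mean(obs) - true_value)  # integrates all readings
    return err_last / trials, err_memory / trials

mae_last, mae_memory = compare_errors()
print(f"last observation only: {mae_last:.3f}")
print(f"running mean (memory): {mae_memory:.3f}")
```

Under these assumptions the memory-based estimate should have a clearly lower error, which is the kind of short-term integration a recurrent policy would have to learn on its own.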

Abstract

Partially observed control problems are a challenging aspect of reinforcement learning. We extend two related, model-free algorithms for continuous control -- deterministic policy gradient and stochastic value gradient -- to solve partially observed domains using recurrent neural networks trained with backpropagation through time. We demonstrate that this approach, coupled with long short-term memory, is able to solve a variety of physical control problems exhibiting an assortment of memory requirements. These include the short-term integration of information from noisy sensors and the identification of system parameters, as well as long-term memory problems that require preserving information over many time steps. We also demonstrate success on a combined exploration and memory problem in the form of a simplified version of the well-known Morris water maze task. Finally, we show that…
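
As a rough sketch of what a recurrent deterministic policy looks like, the code below runs the forward pass of a tiny tanh recurrent network over an observation history and reads a scalar action from the hidden state. It is only an illustration under stated assumptions: it uses a plain tanh cell rather than the LSTM named in the abstract, omits training by backpropagation through time entirely, and the names `rnn_policy_step`, `act_on_history`, and the weights in `PARAMS` are hypothetical.

```python
import math

def rnn_policy_step(h, obs, params):
    """One step of a tiny tanh recurrent policy: returns (new_hidden, action)."""
    W_h, W_o, w_a = params["W_h"], params["W_o"], params["w_a"]
    new_h = []
    for i in range(len(h)):
        pre = sum(W_h[i][j] * h[j] for j in range(len(h)))
        pre += sum(W_o[i][k] * obs[k] for k in range(len(obs)))
        new_h.append(math.tanh(pre))
    # Deterministic scalar action read out from the updated hidden state.
    action = math.tanh(sum(w_a[i] * new_h[i] for i in range(len(new_h))))
    return new_h, action

def act_on_history(observations, params):
    """Feed a whole observation history through the policy; return the last action."""
    h = [0.0] * len(params["w_a"])
    action = 0.0
    for obs in observations:
        h, action = rnn_policy_step(h, obs, params)
    return action

# Arbitrary, hand-picked weights (hypothetical, not learned).
PARAMS = {
    "W_h": [[0.5, -0.3], [0.2, 0.4]],
    "W_o": [[1.0], [-1.0]],
    "w_a": [0.7, -0.6],
}

# Two histories that end in the same observation but differ earlier.
history_a = [[1.0], [0.0]]
history_b = [[-1.0], [0.0]]
action_a = act_on_history(history_a, PARAMS)
action_b = act_on_history(history_b, PARAMS)
print(action_a, action_b)
```

The two histories end with an identical observation yet produce different actions, because the hidden state carries information from the earlier step. A memoryless policy that sees only the current observation could not make that distinction.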

Taxonomy

Topics: Reinforcement Learning in Robotics · Neural Networks and Applications · Adaptive Dynamic Programming Control