# Reinforcement Learning via Recurrent Convolutional Neural Networks

**Authors:** Tanmay Shankar, Santosha K. Dwivedy, Prithwijit Guha

arXiv: 1701.02392 · 2017-01-11

## TL;DR

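This paper represents reinforcement learning problems with Recurrent Convolutional Neural Networks (RCNNs) whose forward passes perform value iteration, belief propagation, and action selection, so that backpropagation can learn the transition model and reward function of the underlying MDP in partially observable tasks.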

## Contribution

The paper presents an RCNN-based framework that learns the transition model and reward function of the underlying MDP by backpropagation, combining advantages of model-free and model-based RL.

## Key findings

- RCNNs learn accurate MDP models (transition model and reward function)
- The framework reduces the cost of replanning in a simulated robot planning problem
- Replanning with the learnt models achieves near-optimal policies

## Abstract

Deep Reinforcement Learning has enabled the learning of policies for complex tasks in partially observable environments, without explicitly learning the underlying model of the tasks. While such model-free methods achieve considerable performance, they often ignore the structure of the task. We present a natural representation of Reinforcement Learning (RL) problems using Recurrent Convolutional Neural Networks (RCNNs), to better exploit this inherent structure. We define three such RCNNs, whose forward passes execute an efficient Value Iteration, propagate beliefs of state in partially observable environments, and choose optimal actions, respectively. Backpropagating gradients through these RCNNs allows the system to explicitly learn the Transition Model and Reward Function associated with the underlying MDP, serving as an elegant alternative to classical model-based RL. We evaluate the proposed algorithms in simulation, considering a robot planning problem. We demonstrate the capability of our framework to reduce the cost of replanning, learn accurate MDP models, and finally re-plan with learnt models to achieve near-optimal policies.
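
To make the idea of value iteration as a recurrent convolution more concrete, here is a minimal sketch. It is not the authors' implementation. It assumes a small 2D grid world with one shift-invariant 3×3 transition kernel per action, a reward that depends only on the current cell, and zero padding at the border (so leaving the grid leads to a zero-value region). The names `correlate`, `value_iteration`, `KERNELS`, and `REWARD`, the grid size, and the discount factor are all hypothetical choices made for illustration. Under these assumptions, each iteration applies one convolution-like step per action and then takes a max across the action channel, which is the recurrence that gets unrolled.

```python
# Illustrative sketch only: grid, kernels, rewards, and discount are assumptions,
# not values or code from the paper.

def correlate(grid, kernel):
    """Cross-correlate a 2D grid with a 3x3 kernel, zero-padding the border."""
    h, w = len(grid), len(grid[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            total = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ni, nj = i + di, j + dj
                    if 0 <= ni < h and 0 <= nj < w:
                        # kernel[di+1][dj+1] = P(move by offset (di, dj) | action)
                        total += kernel[di + 1][dj + 1] * grid[ni][nj]
            out[i][j] = total
    return out


def value_iteration(reward, kernels, gamma=0.9, iterations=50):
    """Unrolled recurrence: V <- R + gamma * max_a (T_a correlated with V)."""
    h, w = len(reward), len(reward[0])
    value = [[0.0] * w for _ in range(h)]
    for _ in range(iterations):
        # One "convolutional layer" per action gives the expected next value.
        q = [correlate(value, k) for k in kernels]
        # Max over the action channel, analogous to max-pooling across channels.
        value = [
            [reward[i][j] + gamma * max(qa[i][j] for qa in q) for j in range(w)]
            for i in range(h)
        ]
    # Greedy policy with respect to the final value estimate.
    q = [correlate(value, k) for k in kernels]
    policy = [
        [max(range(len(kernels)), key=lambda a: q[a][i][j]) for j in range(w)]
        for i in range(h)
    ]
    return value, policy


# Hypothetical deterministic actions: 0 = up, 1 = down, 2 = left, 3 = right.
KERNELS = [
    [[0, 1, 0], [0, 0, 0], [0, 0, 0]],  # up
    [[0, 0, 0], [0, 0, 0], [0, 1, 0]],  # down
    [[0, 0, 0], [1, 0, 0], [0, 0, 0]],  # left
    [[0, 0, 0], [0, 0, 1], [0, 0, 0]],  # right
]

# Hypothetical 3x3 reward map with a single rewarding cell in the top-right corner.
REWARD = [
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]

if __name__ == "__main__":
    value, policy = value_iteration(REWARD, KERNELS)
    for row in value:
        print([round(v, 3) for v in row])
    for row in policy:
        print(row)
```

In this sketch the kernels are fixed by hand. The abstract describes learning the Transition Model and Reward Function by backpropagating through the networks, which would correspond to treating the kernel entries and reward map as trainable parameters in an automatic differentiation framework; that step is not shown here.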

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1701.02392/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/1701.02392/full.md

## References

20 references — full list in the complete paper: https://tomesphere.com/paper/1701.02392/full.md

---
Source: https://tomesphere.com/paper/1701.02392