Neural Network Dynamics for Model-Based Deep Reinforcement Learning with Model-Free Fine-Tuning

Anusha Nagabandi; Gregory Kahn; Ronald S. Fearing; Sergey Levine

arXiv:1708.02596 · cs.LG · December 5, 2017

TL;DR

This paper introduces a hybrid reinforcement learning approach that uses a neural network dynamics model inside model predictive control (MPC) and then applies model-free fine-tuning, improving both sample efficiency and final performance on robotic locomotion tasks.

Contribution

The paper demonstrates that neural network dynamics models can be combined with MPC for sample-efficient model-based RL, and that these models can be used to initialize a model-free learner for better final performance.

Findings

01. The model-based approach achieves excellent sample efficiency with simple random data.

02. The hybrid method accelerates model-free learning by 3-5x on various locomotion benchmarks.

03. Neural network dynamics models enable stable and plausible robotic gaits.
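To make the first finding concrete, here is a minimal sketch of fitting a dynamics model to data gathered with random actions. It is not the authors' implementation: the 1-D system in `true_step`, the choice to predict the state change rather than the next state, and the use of a two-parameter linear model in place of a neural network are all assumptions made to keep the example short and dependency-free.

```python
import random


def true_step(s, u):
    """Hypothetical 1-D system (an assumption); the learner never sees this formula."""
    return 0.9 * s + 0.5 * u


def collect_random_data(n_steps, seed=0):
    """Roll out uniformly random actions and record (state, action, state change)."""
    rng = random.Random(seed)
    data, s = [], 0.0
    for _ in range(n_steps):
        u = rng.uniform(-1.0, 1.0)
        s_next = true_step(s, u)
        data.append((s, u, s_next - s))
        s = s_next
    return data


def fit_linear_delta_model(data, lr=0.5, epochs=200):
    """Fit delta ~= a*s + b*u by full-batch gradient descent on mean squared error.

    A linear model stands in for the neural network here (an assumption).
    """
    a, b = 0.0, 0.0
    n = len(data)
    for _ in range(epochs):
        grad_a = grad_b = 0.0
        for s, u, delta in data:
            err = a * s + b * u - delta
            grad_a += 2.0 * err * s / n
            grad_b += 2.0 * err * u / n
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b


if __name__ == "__main__":
    a, b = fit_linear_delta_model(collect_random_data(500))
    print(f"learned delta model: {a:.3f} * s + {b:.3f} * u")
```

For this toy system the true state change is -0.1 * s + 0.5 * u, so the fitted coefficients can be checked directly against those values.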

Abstract

Model-free deep reinforcement learning algorithms have been shown to be capable of learning a wide range of robotic skills, but typically require a very large number of samples to achieve good performance. Model-based algorithms, in principle, can provide for much more efficient learning, but have proven difficult to extend to expressive, high-capacity models such as deep neural networks. In this work, we demonstrate that medium-sized neural network models can in fact be combined with model predictive control (MPC) to achieve excellent sample complexity in a model-based reinforcement learning algorithm, producing stable and plausible gaits to accomplish various complex locomotion tasks. We also propose using deep neural network dynamics models to initialize a model-free learner, in order to combine the sample efficiency of model-based approaches with the high task-specific performance…
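The combination of a dynamics model with MPC described in the abstract can be sketched as follows. This is an illustrative example under stated assumptions, not the paper's code: it uses random-shooting action selection, a toy 1-D `model_step` standing in for a learned neural network model, and a hypothetical `reward` that pulls the state toward a target. All function names and constants are invented for the example.

```python
import random


def model_step(s, u):
    """Stand-in for a learned dynamics model (an assumption): predicts the next state."""
    return s + (-0.1 * s + 0.5 * u)


def reward(s, target=1.0):
    """Hypothetical reward: negative squared distance to a target state."""
    return -(s - target) ** 2


def mpc_action(s0, rng, horizon=5, n_candidates=500):
    """Random-shooting MPC: sample action sequences, score them under the model,
    and return only the first action of the best-scoring sequence."""
    best_return, best_first = float("-inf"), 0.0
    for _ in range(n_candidates):
        seq = [rng.uniform(-1.0, 1.0) for _ in range(horizon)]
        s, total = s0, 0.0
        for u in seq:
            s = model_step(s, u)
            total += reward(s)
        if total > best_return:
            best_return, best_first = total, seq[0]
    return best_first


def run_closed_loop(steps=30, seed=0):
    """Replan at every step. The 'real' system is assumed equal to the model here."""
    rng = random.Random(seed)
    s = 0.0
    for _ in range(steps):
        u = mpc_action(s, rng)
        s = model_step(s, u)
    return s


if __name__ == "__main__":
    print("final state:", run_closed_loop())
```

Replanning at every step and executing only the first action is what lets an imperfect model still produce usable behavior: prediction errors over the horizon are corrected as soon as the next real state is observed.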
