Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models

Kurtland Chua; Roberto Calandra; Rowan McAllister; Sergey Levine

arXiv:1805.12114·cs.LG·November 5, 2018·185 cites

TL;DR

This paper introduces PETS, a model-based RL algorithm using probabilistic ensembles and trajectory sampling, achieving comparable performance to model-free methods with far fewer samples.

Contribution

The paper presents PETS, an algorithm that pairs uncertainty-aware deep network dynamics models with sampling-based uncertainty propagation to improve sample efficiency in deep RL.

Findings

01

PETS matches the asymptotic performance of state-of-the-art model-free algorithms.

02

PETS requires significantly fewer samples than model-free methods (e.g., 8 and 125 times fewer than Soft Actor Critic and PPO, respectively).

03

The approach is effective on several challenging benchmark tasks.

Abstract

Model-based reinforcement learning (RL) algorithms can attain excellent sample efficiency, but often lag behind the best model-free algorithms in terms of asymptotic performance. This is especially true with high-capacity parametric function approximators, such as deep networks. In this paper, we study how to bridge this gap, by employing uncertainty-aware dynamics models. We propose a new algorithm called probabilistic ensembles with trajectory sampling (PETS) that combines uncertainty-aware deep network dynamics models with sampling-based uncertainty propagation. Our comparison to state-of-the-art model-based and model-free deep RL algorithms shows that our approach matches the asymptotic performance of model-free algorithms on several challenging benchmark tasks, while requiring significantly fewer samples (e.g., 8 and 125 times fewer samples than Soft Actor Critic and Proximal…
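The two ingredients named in the abstract, an ensemble of probabilistic dynamics models and sampling-based uncertainty propagation, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy linear "models" stand in for trained deep networks, the reward is a toy quadratic, and the planner is plain random shooting chosen for brevity. All function names (`make_toy_ensemble`, `expected_return`, `plan`) are hypothetical.

```python
import numpy as np

def make_toy_ensemble(n_models, rng):
    """Hypothetical stand-in for trained probabilistic networks: each member
    maps (state, action) to the mean and std of a Gaussian over the next state."""
    gains = 1.0 + 0.05 * rng.standard_normal(n_models)  # members disagree slightly

    def predict(member, state, action):
        mean = state + gains[member] * action
        std = np.full_like(state, 0.01)  # assumed constant predictive noise
        return mean, std

    return predict

def expected_return(predict, n_models, state0, actions, n_particles, rng):
    """Propagate sampled particles through the ensemble and average the return."""
    states = np.repeat(state0[None, :], n_particles, axis=0)
    members = rng.integers(n_models, size=n_particles)  # one ensemble member per particle
    total = np.zeros(n_particles)
    for action in actions:
        for i in range(n_particles):
            mean, std = predict(members[i], states[i], action)
            # Sample the next state instead of using only the mean prediction.
            states[i] = mean + std * rng.standard_normal(mean.shape)
        total += -np.sum(states ** 2, axis=1)  # toy reward: stay near the origin
    return total.mean()

def plan(predict, n_models, state0, horizon, n_candidates, n_particles, rng):
    """Random-shooting planner (an assumption): score random action sequences
    and return the first action of the best-scoring one."""
    candidates = rng.uniform(-1.0, 1.0, size=(n_candidates, horizon, state0.size))
    scores = [
        expected_return(predict, n_models, state0, c, n_particles, rng)
        for c in candidates
    ]
    return candidates[int(np.argmax(scores))][0]

rng = np.random.default_rng(0)
predict = make_toy_ensemble(n_models=5, rng=rng)
first_action = plan(predict, 5, np.array([1.0, -1.0]),
                    horizon=5, n_candidates=200, n_particles=20, rng=rng)
```

In this sketch, disagreement between ensemble members and the per-step Gaussian sampling both spread the particles apart, so an action sequence is scored by its average return over plausible futures rather than by a single deterministic rollout.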

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Simulation Techniques and Applications

Methods: Experience Replay · Dense Connections · Adam · Soft Actor Critic