On the Expressivity of Neural Networks for Deep Reinforcement Learning

Kefan Dong; Yuping Luo; Tengyu Ma

arXiv:1910.05927 · cs.LG · September 8, 2020 · 1 citation

TL;DR

This paper investigates the expressive limitations of neural networks in deep reinforcement learning. It shows that many MDPs have optimal policies and $Q$-functions that are more complex than their dynamics, which favors model-based planning, and it proposes a bootstrapping planner (BOOTS) to improve policy performance.

Contribution

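The paper provides theoretical and empirical evidence that representing optimal policies and $Q$-functions can require far more expressive power than representing the dynamics. It also introduces BOOTS, a simple multi-step model-based bootstrapping planner that turns a weak $Q$-function into a stronger policy.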

Findings

1. Model-based planning better approximates optimal policies in complex MDPs.

2. Applying BOOTS improves performance on MuJoCo tasks.

3. Optimal policies can be more complex than dynamics even in simple state spaces.
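
The third finding can be made concrete with a small numerical illustration. The sketch below is not the paper's construction; it assumes a standard one-dimensional tent map as a stand-in for simple piecewise-linear dynamics, and the function names are hypothetical. It counts how many linear pieces the k-step rollout of that map has, which suggests why quantities that depend on multi-step rollouts, such as value functions, may need far more pieces than the one-step dynamics.

```python
def tent(x: float) -> float:
    """One-step dynamics on [0, 1]: piecewise linear with just two pieces."""
    return 2.0 * x if x <= 0.5 else 2.0 * (1.0 - x)


def rollout(x: float, k: int) -> float:
    """Apply the tent map k times to the starting state x."""
    for _ in range(k):
        x = tent(x)
    return x


def count_linear_pieces(k: int) -> int:
    """Count linear pieces of the k-step rollout map on [0, 1].

    The grid uses dyadic points with spacing 2^-(k+2), so every value is
    exact in floating point and each linear piece spans four grid cells.
    Adjacent pieces have slopes of opposite sign, so counting sign changes
    of the finite differences recovers the number of pieces.
    """
    n = 2 ** (k + 2)
    values = [rollout(i / n, k) for i in range(n + 1)]
    diffs = [b - a for a, b in zip(values, values[1:])]
    sign_changes = sum(
        1 for d0, d1 in zip(diffs, diffs[1:]) if (d0 > 0) != (d1 > 0)
    )
    return sign_changes + 1


if __name__ == "__main__":
    for k in range(1, 6):
        print(k, count_linear_pieces(k))
```

In this toy setting the number of pieces doubles with each additional rollout step, while the one-step map itself stays at two pieces.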

Abstract

We compare the model-free reinforcement learning with the model-based approaches through the lens of the expressive power of neural networks for policies, $Q$-functions, and dynamics. We show, theoretically and empirically, that even for one-dimensional continuous state space, there are many MDPs whose optimal $Q$-functions and policies are much more complex than the dynamics. We hypothesize many real-world MDPs also have a similar property. For these MDPs, model-based planning is a favorable algorithm, because the resulting policies can approximate the optimal policy significantly better than a neural network parameterization can, and model-free or model-based policy optimization rely on policy parameterization. Motivated by the theory, we apply a simple multi-step model-based bootstrapping planner (BOOTS) to bootstrap a weak $Q$-function into a stronger policy. Empirical results show…
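
The idea of bootstrapping a weak $Q$-function with multi-step planning can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes a small discrete action set, a deterministic dynamics model, a known reward function, and exhaustive search over action sequences. The names `boots_action`, `toy_dynamics`, `toy_reward`, and `weak_q` are hypothetical, and the discount `gamma` is an added assumption.

```python
from itertools import product
from typing import Callable, Sequence


def boots_action(
    state: int,
    actions: Sequence[int],
    dynamics: Callable[[int, int], int],
    reward: Callable[[int, int], float],
    q_fn: Callable[[int, int], float],
    horizon: int,
    gamma: float = 1.0,
) -> int:
    """Return the first action of the best (horizon + 1)-action sequence.

    The first `horizon` actions are simulated with the dynamics model and
    scored by the reward; the final action is scored by q_fn, which acts
    as the bootstrap value. With horizon=0 this is greedy argmax over q_fn.
    """
    best_value, best_first = float("-inf"), actions[0]
    for seq in product(actions, repeat=horizon + 1):
        s, total, discount = state, 0.0, 1.0
        for a in seq[:-1]:
            total += discount * reward(s, a)
            s = dynamics(s, a)
            discount *= gamma
        total += discount * q_fn(s, seq[-1])
        if total > best_value:
            best_value, best_first = total, seq[0]
    return best_first


# Toy problem (hypothetical): integer states, moves of -1 or +1,
# and a reward of 1 for stepping onto state 2.
def toy_dynamics(s: int, a: int) -> int:
    return s + a


def toy_reward(s: int, a: int) -> float:
    return 1.0 if s + a == 2 else 0.0


def weak_q(s: int, a: int) -> float:
    # A deliberately poor Q-function that always prefers moving left.
    return -0.1 * a


if __name__ == "__main__":
    for k in (0, 2):
        act = boots_action(0, [-1, 1], toy_dynamics, toy_reward, weak_q, horizon=k)
        print(f"horizon={k}: first action {act}")
```

In this toy problem the greedy policy from the weak $Q$-function moves away from the rewarding state, while two steps of lookahead through the model recover the rewarding direction. Exhaustive search is exponential in the horizon, so a practical planner would need a different optimizer over action sequences.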

Code & Models

Repositories

roosephu/boots (tf · Official)

Videos

On the Expressivity of Neural Networks for Deep Reinforcement Learning · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Machine Learning and Algorithms

Methods: Test