Value Prediction Network

Junhyuk Oh; Satinder Singh; Honglak Lee

arXiv:1707.03497 · cs.AI · November 8, 2017 · 42 citations

TL;DR

The paper introduces the Value Prediction Network (VPN), a deep RL architecture that combines model-free and model-based methods in a single neural network. Its dynamics model predicts future values conditioned on options rather than future observations. VPN outperforms baselines in a stochastic environment that requires planning, and outperforms DQN on several Atari games.

Contribution

Unlike typical model-based RL methods, VPN learns a dynamics model whose abstract states are trained to predict future values conditioned on options rather than future observations, integrating model-based and model-free RL in a single neural network.

Findings

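01. VPN outperforms both model-free and model-based baselines in a stochastic environment where careful planning is required but an accurate observation-prediction model is difficult to build.

02. VPN outperforms DQN on several Atari games, even with short-lookahead planning.

03. The Atari results suggest that VPN's value-prediction objective is a promising way to learn a good state representation.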

Abstract

This paper proposes a novel deep reinforcement learning (RL) architecture, called Value Prediction Network (VPN), which integrates model-free and model-based RL methods into a single neural network. In contrast to typical model-based RL methods, VPN learns a dynamics model whose abstract states are trained to make option-conditional predictions of future values (discounted sum of rewards) rather than of future observations. Our experimental results show that VPN has several advantages over both model-free and model-based baselines in a stochastic environment where careful planning is required but building an accurate observation-prediction model is difficult. Furthermore, VPN outperforms Deep Q-Network (DQN) on several Atari games even with short-lookahead planning, demonstrating its potential as a new way of learning a good state representation.
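The idea of planning over option-conditional value predictions can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a hypothetical model interface that maps an abstract state and an option to a predicted reward, discount, and next abstract state, and it uses plain depth-limited max-backup lookahead, which may differ from the planning rule in the paper. The names `toy_model`, `zero_value`, `q_plan`, and `plan` are made up for illustration, and the toy model is hand-written rather than learned.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical interface (an assumption, not the paper's API):
# model(state, option) -> (predicted reward, predicted discount, next abstract state)
Model = Callable[[float, int], Tuple[float, float, float]]
ValueFn = Callable[[float], float]


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Discounted sum of rewards: r_0 + gamma * r_1 + gamma^2 * r_2 + ..."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


def q_plan(state: float, option: int, depth: int,
           model: Model, value: ValueFn, options: Sequence[int]) -> float:
    """Estimate the value of taking `option` in `state` with `depth`-step lookahead."""
    reward, gamma, next_state = model(state, option)
    if depth <= 1:
        # At the leaf, bootstrap from the predicted value of the next state.
        return reward + gamma * value(next_state)
    # Otherwise back up the best option value found one level deeper.
    best = max(q_plan(next_state, o, depth - 1, model, value, options)
               for o in options)
    return reward + gamma * best


def plan(state: float, depth: int, model: Model, value: ValueFn,
         options: Sequence[int]) -> int:
    """Pick the option with the highest lookahead estimate."""
    return max(options,
               key=lambda o: q_plan(state, o, depth, model, value, options))


def toy_model(state: float, option: int) -> Tuple[float, float, float]:
    """Toy stand-in for a learned model: option 0 moves left, option 1 moves right.
    Reaching position 2 yields reward 1."""
    next_state = state + (1.0 if option == 1 else -1.0)
    reward = 1.0 if next_state == 2.0 else 0.0
    return reward, 0.9, next_state


def zero_value(state: float) -> float:
    """Uninformative value estimate, so only lookahead can find the reward."""
    return 0.0
```

With an uninformative value function, one-step lookahead from position 0 cannot distinguish the two options, while two-step lookahead finds the reward at position 2 and prefers moving right. In this toy setting, that is how even short lookahead can change the chosen option.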

Taxonomy

Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications · Data Stream Mining Techniques