TreeQN and ATreeC: Differentiable Tree-Structured Models for Deep Reinforcement Learning

Gregory Farquhar; Tim Rocktäschel; Maximilian Igl; Shimon Whiteson

arXiv:1710.11417 · cs.AI · March 9, 2018 · 28 cites

TL;DR

This paper introduces TreeQN and ATreeC, differentiable tree-structured models for deep reinforcement learning that improve planning and value estimation in complex environments by learning transition models end-to-end.

Contribution

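The paper presents TreeQN, a differentiable, recursive, tree-structured model that can replace any value function network in deep RL with discrete actions, and ATreeC, an actor-critic variant. Both integrate a learned transition model into the network itself, so the model is trained end-to-end for planning and value estimation.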

Findings

01 TreeQN and ATreeC outperform baseline methods on a box-pushing task.

02 They also outperform n-step DQN and Value Prediction Networks on Atari games.

03 Ablation studies show the importance of auxiliary losses for learning transition models.

Abstract

Combining deep model-free reinforcement learning with on-line planning is a promising approach to building on the successes of deep RL. On-line planning with look-ahead trees has proven successful in environments where transition models are known a priori. However, in complex environments where transition models need to be learned from data, the deficiencies of learned models have limited their utility for planning. To address these challenges, we propose TreeQN, a differentiable, recursive, tree-structured model that serves as a drop-in replacement for any value function network in deep RL with discrete actions. TreeQN dynamically constructs a tree by recursively applying a transition model in a learned abstract state space and then aggregating predicted rewards and state-values using a tree backup to estimate Q-values. We also propose ATreeC, an actor-critic variant that augments…
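The recursive construction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `tree_q_values`, `toy_transition`, `toy_reward`, and `toy_value` are hypothetical, the learned networks are replaced by hand-written toy functions, and the backup is assumed to be a plain max over child Q-values, which the abstract does not specify.

```python
from typing import Callable, List

State = List[float]  # abstract state, here just a list of floats


def tree_q_values(
    z: State,
    depth: int,
    num_actions: int,
    transition: Callable[[State, int], State],
    reward: Callable[[State, int], float],
    value: Callable[[State], float],
    gamma: float = 0.99,
) -> List[float]:
    """Estimate Q(z, a) for every action by expanding a tree of the given depth.

    Assumption: interior nodes back up with a max over child Q-values;
    leaves use the state-value estimate directly.
    """
    q_values = []
    for action in range(num_actions):
        z_next = transition(z, action)  # predicted next abstract state
        r = reward(z, action)           # predicted immediate reward
        if depth == 1:
            backup = value(z_next)      # leaf: use the value estimate
        else:
            backup = max(
                tree_q_values(z_next, depth - 1, num_actions,
                              transition, reward, value, gamma)
            )
        q_values.append(r + gamma * backup)
    return q_values


# Toy stand-ins for the learned components (hypothetical, for illustration only).
def toy_transition(z: State, action: int) -> State:
    return [x + action for x in z]


def toy_reward(z: State, action: int) -> float:
    return float(action)


def toy_value(z: State) -> float:
    return sum(z)


if __name__ == "__main__":
    print(tree_q_values([0.0], 2, 2, toy_transition, toy_reward, toy_value, gamma=0.5))
```

In a trainable version, the three toy functions would be differentiable modules, so gradients from the Q-value loss would flow back through the whole tree.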

Code & Models

Repositories

oxwhirl/treeqn
PyTorch · Official

Taxonomy

Topics: Reinforcement Learning in Robotics

Methods: Q-Learning · A2C · Dense Connections · Convolution · Deep Q-Network