Exploratory Gradient Boosting for Reinforcement Learning in Complex Domains

David Abel; Alekh Agarwal; Fernando Diaz; Akshay Krishnamurthy; Robert E. Schapire

arXiv:1603.04119·cs.AI·March 15, 2016·24 cites




TL;DR

This paper introduces a gradient-boosting function approximator and an exploration strategy for reinforcement learning in high-dimensional, complex environments, and reports empirical gains on two Minecraft-based benchmarks with pixel observations.

Contribution

It presents a novel non-parametric gradient-boosting approach for Q-function approximation and an exploration method inspired by state abstraction, both tailored for complex, high-dimensional RL tasks.
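To make the idea of boosting on $Q$-function residuals concrete, here is a minimal sketch, not the authors' algorithm. It assumes regression stumps as the weak learners, a hypothetical deterministic chain MDP, and a hand-picked feature map `[state, action]`; the names `fit_stump`, `boost_step`, `chain_transitions`, and `fit_q` are all illustrative. Each round recomputes Bellman targets from the current ensemble and fits one new stump to the remaining residual.

```python
import numpy as np

def fit_stump(X, r):
    """Fit a one-split regression stump to residuals r by least squares."""
    best = None
    for j in range(X.shape[1]):
        # Candidate thresholds: every unique value except the largest,
        # so both sides of the split are non-empty.
        for t in np.unique(X[:, j])[:-1]:
            left = X[:, j] <= t
            lv, rv = r[left].mean(), r[~left].mean()
            err = ((r - np.where(left, lv, rv)) ** 2).sum()
            if best is None or err < best[0]:
                best = (err, j, t, lv, rv)
    if best is None:  # every feature is constant: fall back to the mean
        m = r.mean()
        return lambda Z: np.full(len(Z), m)
    _, j, t, lv, rv = best
    return lambda Z: np.where(Z[:, j] <= t, lv, rv)

def predict(ensemble, X, lr):
    """Additive model: sum of shrunken weak-learner outputs."""
    out = np.zeros(len(X))
    for h in ensemble:
        out += lr * h(X)
    return out

def boost_step(ensemble, X, targets, lr):
    """Fit one stump to the current residual and append it to the ensemble."""
    residual = targets - predict(ensemble, X, lr)
    ensemble.append(fit_stump(X, residual))

def chain_transitions(n):
    """Enumerate (s, a, reward, s') for a hypothetical deterministic chain.

    Action 0 resets to state 0 with reward 2; action 1 steps right and
    pays 10 only when taken in the last state.
    """
    rows = []
    for s in range(n):
        rows.append((s, 0, 2.0, 0))
        rows.append((s, 1, 10.0 if s == n - 1 else 0.0, min(s + 1, n - 1)))
    return rows

def fit_q(n=5, rounds=50, lr=0.5, gamma=0.9):
    """Boost a Q-function on Bellman residuals over all chain transitions."""
    data = chain_transitions(n)
    S = np.array([[s, a] for s, a, _, _ in data], dtype=float)
    R = np.array([r for _, _, r, _ in data])
    # Next-state features under each action, used for max_a' Q(s', a').
    N0 = np.array([[s2, 0] for _, _, _, s2 in data], dtype=float)
    N1 = np.array([[s2, 1] for _, _, _, s2 in data], dtype=float)
    ensemble = []
    for _ in range(rounds):
        next_q = np.maximum(predict(ensemble, N0, lr), predict(ensemble, N1, lr))
        targets = R + gamma * next_q  # Bellman targets under the current Q
        boost_step(ensemble, S, targets, lr)
    return ensemble, S

if __name__ == "__main__":
    ensemble, S = fit_q()
    print(np.round(predict(ensemble, S, 0.5), 2))
```

Because the stumps split on one feature at a time, this ensemble is additive in state and action, which limits what it can represent; deeper trees or richer features would be the natural next step in a less toy setting.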

Findings

01. Outperforms baselines on high-dimensional Minecraft tasks

02. Maintains competitive performance on standard RL benchmarks

03. Provides new benchmarks for visual RL environments

Abstract

High-dimensional observations and complex real-world dynamics present major challenges in reinforcement learning for both function approximation and exploration. We address both of these challenges with two complementary techniques: First, we develop a gradient-boosting style, non-parametric function approximator for learning on $Q$-function residuals. And second, we propose an exploration strategy inspired by the principles of state abstraction and information acquisition under uncertainty. We demonstrate the empirical effectiveness of these techniques, first, as a preliminary check, on two standard tasks (Blackjack and $n$-Chain), and then on two much larger and more realistic tasks with high-dimensional observation spaces. Specifically, we introduce two benchmarks built within the game Minecraft where the observations are pixel arrays of the agent's visual field. A combination of our…


Code & Models

Repositories

wattlebirdaz/geql


Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI)