Feature-Based Aggregation and Deep Reinforcement Learning: A Survey and Some New Implementations

Dimitri P. Bertsekas

arXiv:1804.04577 · cs.LG · August 23, 2018


TL;DR

This paper surveys feature-based aggregation methods in reinforcement learning, introduces new implementations combining deep neural networks with aggregation, and discusses their potential for more accurate policy improvement in finite-state Markov decision problems.

Contribution

The paper presents a new approach to approximate policy iteration in which the policy improvement step combines feature-based aggregation with features constructed by deep neural networks or other calculations.

Findings

01. Feature-based aggregation can improve policy approximation accuracy.

02. Deep neural networks enhance feature construction for better policy improvement.

03. The paper argues that the nonlinear function of the features provided by aggregation can approximate the cost function of a policy much more accurately than a linear function of the features.

Abstract

In this paper we discuss policy iteration methods for approximate solution of a finite-state discounted Markov decision problem, with a focus on feature-based aggregation methods and their connection with deep reinforcement learning schemes. We introduce features of the states of the original problem, and we formulate a smaller "aggregate" Markov decision problem, whose states relate to the features. We discuss properties and possible implementations of this type of aggregation, including a new approach to approximate policy iteration. In this approach the policy improvement operation combines feature-based aggregation with feature construction using deep neural networks or other calculations. We argue that the cost function of a policy may be approximated much more accurately by the nonlinear function of the features provided by aggregation, than by the linear function of the features…
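To make the idea of a smaller "aggregate" problem concrete, here is a minimal sketch of one simple variant, hard aggregation, in which every original state is assigned to exactly one aggregate state by an integer-valued feature. This is an illustration under stated assumptions, not the paper's implementation: the function name `aggregate_and_solve`, the array layout, the uniform disaggregation probabilities, and the toy problem are all hypothetical choices made for this example.

```python
import numpy as np

def aggregate_and_solve(P, g, feature, alpha, iters=500):
    """Hard aggregation sketch (assumed setup, not the paper's code).

    P:       (m, n, n) transition probabilities, one matrix per action
    g:       (m, n) expected stage cost of each action in each state
    feature: length-n integer array mapping each state to an aggregate
             state 0..q-1 (every value is assumed to occur at least once)
    alpha:   discount factor in (0, 1)
    """
    m, n, _ = P.shape
    q = int(feature.max()) + 1

    # Aggregation matrix: state i belongs to aggregate state feature[i].
    Phi = np.zeros((n, q))
    Phi[np.arange(n), feature] = 1.0

    # Disaggregation matrix: uniform over the states of each group
    # (an assumption; other distributions are possible).
    D = Phi.T / Phi.sum(axis=0)[:, None]

    # Transition probabilities and costs of the aggregate problem.
    P_hat = np.einsum("xi,aij,jy->axy", D, P, Phi)  # (m, q, q)
    g_hat = g @ D.T                                  # (m, q)

    # Value iteration on the (small) aggregate problem.
    r = np.zeros(q)
    for _ in range(iters):
        r = (g_hat + alpha * (P_hat @ r)).min(axis=0)

    # Piecewise-constant cost approximation on the original states,
    # followed by one-step lookahead to obtain a policy.
    J_tilde = Phi @ r
    policy = (g + alpha * (P @ J_tilde)).argmin(axis=0)
    return r, J_tilde, policy

# Toy problem: 4 states in 2 groups, 2 actions.
# Action 0 stays within the current group; action 1 moves to group 1.
P_demo = np.zeros((2, 4, 4))
P_demo[0, :2, :2] = 0.5
P_demo[0, 2:, 2:] = 0.5
P_demo[1, :, 2:] = 0.5
g_demo = np.array([[2.0, 2.0, 0.0, 0.0],
                   [1.0, 1.0, 1.0, 1.0]])
feat_demo = np.array([0, 0, 1, 1])

r_demo, J_demo, policy_demo = aggregate_and_solve(P_demo, g_demo, feat_demo, alpha=0.9)
# r_demo      -> [1., 0.]
# policy_demo -> [1, 1, 0, 0]
```

In this toy problem the states within each group behave identically, so the aggregate solution is exact. In general the approximation is only piecewise constant over the groups, and its quality depends on how well the feature separates states with different costs.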
