Unsupervised Basis Function Adaptation for Reinforcement Learning

Edward Barker; Charl Ras

arXiv:1703.07940 · cs.LG · February 19, 2019 · 1 citation

TL;DR

This paper introduces an online adaptive state aggregation method for reinforcement learning that uses visit frequency feedback to improve value function approximation, demonstrating theoretical benefits and empirical performance gains.

Contribution

The paper presents a novel algorithm for adaptively refining approximation architectures in RL using visit frequency, with theoretical analysis and experimental validation.

Findings

01. The algorithm reduces value function error in the tested environments.

02. The adaptive architecture improves RL performance over static methods.

03. Theoretical analysis confirms convergence and complexity bounds.

Abstract

When using reinforcement learning (RL) algorithms it is common, given a large state space, to introduce some form of approximation architecture for the value function (VF). The exact form of this architecture can have a significant effect on an agent's performance, however, and determining a suitable approximation architecture can often be a highly complex task. Consequently there is currently interest among researchers in the potential for allowing RL algorithms to adaptively generate (i.e. to learn) approximation architectures. One relatively unexplored method of adapting approximation architectures involves using feedback regarding the frequency with which an agent has visited certain states to guide which areas of the state space to approximate with greater detail. In this article we will: (a) informally discuss the potential advantages offered by such methods; (b) introduce a new…
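The idea of letting visit frequency decide where the approximation gets finer can be illustrated with a small sketch. This is not the algorithm introduced in the paper; it is a hypothetical example that assumes a one-dimensional state space [0, 1), a piecewise-constant value function over cells (state aggregation), tabular TD(0) updates, and a simple rule that halves the most-visited cell. The class name `AdaptiveAggregation` and all parameters (`n_cells`, `max_cells`, `alpha`, `gamma`) are made up for illustration.

```python
import bisect
import random


class AdaptiveAggregation:
    """Hypothetical 1-D state aggregation over [0, 1) that refines busy cells."""

    def __init__(self, n_cells=2, max_cells=8):
        # Interior boundaries of equal-width cells; n cells need n - 1 edges.
        self.edges = [i / n_cells for i in range(1, n_cells)]
        self.values = [0.0] * n_cells   # one value estimate per cell
        self.visits = [0] * n_cells     # visit counts since the last refinement
        self.max_cells = max_cells

    def cell(self, s):
        # Index of the cell containing state s.
        return bisect.bisect_right(self.edges, s)

    def update(self, s, reward, s_next, alpha=0.1, gamma=0.9):
        # TD(0) update on the aggregated value function, plus a visit count.
        i, j = self.cell(s), self.cell(s_next)
        self.visits[i] += 1
        td_error = reward + gamma * self.values[j] - self.values[i]
        self.values[i] += alpha * td_error

    def refine(self):
        # Split the most-visited cell in half (assumed rule, not the paper's).
        if len(self.values) >= self.max_cells:
            return False
        i = max(range(len(self.visits)), key=self.visits.__getitem__)
        lo = self.edges[i - 1] if i > 0 else 0.0
        hi = self.edges[i] if i < len(self.edges) else 1.0
        self.edges.insert(i, (lo + hi) / 2)
        self.values.insert(i, self.values[i])  # both halves inherit the value
        self.visits = [0] * len(self.values)   # restart counting
        return True


# Toy usage: every visited state lies in [0, 0.25) and every step pays 1.
agg = AdaptiveAggregation(n_cells=2, max_cells=4)
rng = random.Random(0)
for step in range(1, 2001):
    s = rng.random() * 0.25
    s_next = rng.random() * 0.25
    agg.update(s, reward=1.0, s_next=s_next)
    if step % 500 == 0:
        agg.refine()

print(agg.edges)  # [0.125, 0.25, 0.5]
```

In this toy run the cell boundaries concentrate near 0, where the agent spends its time, while the never-visited region [0.5, 1) stays as one coarse cell. The feedback is unsupervised in the sense that the refinement rule looks only at visit counts and never at the value function error.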

Taxonomy

Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control · Advanced Control Systems Optimization