Unsupervised Basis Function Adaptation for Reinforcement Learning
Edward Barker, Charl Ras

TL;DR
This paper introduces an online adaptive state aggregation method for reinforcement learning that uses visit frequency feedback to improve value function approximation, demonstrating theoretical benefits and empirical performance gains.
Contribution
The paper presents a novel algorithm for adaptively refining approximation architectures in RL using visit frequency, with theoretical analysis and experimental validation.
Findings
The algorithm reduces value function error in the tested environments.
The adaptive architecture improves RL performance over static methods.
Theoretical analysis confirms convergence and complexity bounds.
Abstract
When using reinforcement learning (RL) algorithms it is common, given a large state space, to introduce some form of approximation architecture for the value function (VF). The exact form of this architecture can have a significant effect on an agent's performance, however, and determining a suitable approximation architecture can often be a highly complex task. Consequently there is currently interest among researchers in the potential for allowing RL algorithms to adaptively generate (i.e. to learn) approximation architectures. One relatively unexplored method of adapting approximation architectures involves using feedback regarding the frequency with which an agent has visited certain states to guide which areas of the state space to approximate with greater detail. In this article we will: (a) informally discuss the potential advantages offered by such methods; (b) introduce a new…
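The idea of using visit frequency to decide where to approximate in greater detail can be illustrated with a minimal sketch. This is not the authors' algorithm: the class `VisitFrequencyAggregator`, its methods, the integer state space, and the "split the most-visited cell in half" rule are all hypothetical choices made for illustration, and the number of cells grows here, which the paper's method may not allow.

```python
from collections import Counter


class VisitFrequencyAggregator:
    """Hypothetical state aggregation over integer states 0..n_states-1.

    Illustrative only; not the method proposed in the paper.
    """

    def __init__(self, n_states, n_cells):
        # Start with contiguous, roughly equal-width cells stored as
        # half-open ranges (lo, hi).
        width = -(-n_states // n_cells)  # ceiling division
        self.cells = [(lo, min(lo + width, n_states))
                      for lo in range(0, n_states, width)]
        self.visits = Counter()

    def cell_of(self, state):
        """Return the index of the cell containing `state`."""
        for idx, (lo, hi) in enumerate(self.cells):
            if lo <= state < hi:
                return idx
        raise ValueError(f"state {state} is outside the state space")

    def record_visit(self, state):
        """Count one visit against the cell containing `state`."""
        self.visits[self.cell_of(state)] += 1

    def refine(self):
        """Split the most-visited splittable cell in half.

        Visit counts are reset afterwards because cell indices change.
        Returns False if every cell already holds a single state.
        """
        splittable = [i for i, (lo, hi) in enumerate(self.cells) if hi - lo > 1]
        if not splittable:
            return False
        target = max(splittable, key=lambda i: self.visits[i])
        lo, hi = self.cells[target]
        mid = (lo + hi) // 2
        self.cells[target:target + 1] = [(lo, mid), (mid, hi)]
        self.visits.clear()
        return True


# Usage: the agent mostly visits low-numbered states, so the first cell
# is the one that gets refined.
agg = VisitFrequencyAggregator(n_states=16, n_cells=2)
for s in [1, 2, 1, 3, 1, 12]:
    agg.record_visit(s)
agg.refine()
print(agg.cells)  # [(0, 4), (4, 8), (8, 16)]
```

In a full RL agent, one value estimate would be stored per cell, and a step like `refine` would run periodically so that frequently visited regions receive finer resolution.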
Taxonomy
Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control · Advanced Control Systems Optimization
