Optimality-based Analysis of XCSF Compaction in Discrete Reinforcement Learning

Jordan T. Bishop; Marcus Gallagher

arXiv:2009.01476·cs.LG·September 4, 2020

TL;DR

This paper analyzes how compaction techniques can reduce the population size of XCSF, a learning classifier system used as a Q-function approximator in reinforcement learning, without sacrificing performance.

Contribution

The paper introduces a novel compaction algorithm, Greedy Niche Mass Compaction (GNMC), and shows that it reduces the size of XCSF's trained populations while maintaining accuracy.

Findings

1. GNMC preserves or improves function approximation error.
2. GNMC significantly reduces population size.
3. Policy accuracy is reasonably preserved.

Abstract

Learning classifier systems (LCSs) are population-based predictive systems that were originally envisioned as agents to act in reinforcement learning (RL) environments. These systems can suffer from population bloat and so are amenable to compaction techniques that try to strike a balance between population size and performance. A well-studied LCS architecture is XCSF, which in the RL setting acts as a Q-function approximator. We apply XCSF to a deterministic and a stochastic variant of the FrozenLake8x8 environment from OpenAI Gym, and compare its performance, in terms of function approximation error and policy accuracy, to the optimal Q-functions and policies produced by solving the environments via dynamic programming. We then introduce a novel compaction algorithm (Greedy Niche Mass Compaction, GNMC) and study its operation on XCSF's trained populations. Results show that given a…
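
The evaluation idea in the abstract (solve the environment exactly, then score an approximate Q-function against the optimum) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the 3x3 map is a hypothetical stand-in for FrozenLake8x8, the discount factor of 0.95 is arbitrary, and the two metric definitions (mean absolute error over state-action pairs, and the fraction of non-terminal states whose greedy action is optimal) are plausible readings of "function approximation error" and "policy accuracy", not necessarily the paper's exact definitions.

```python
from itertools import product

# Hypothetical 3x3 map in FrozenLake notation: S=start, F=frozen, H=hole, G=goal.
GRID = ["SFF",
        "FHF",
        "FFG"]
N = len(GRID)
# Actions as (d_row, d_col), ordered left, down, right, up.
ACTIONS = [(0, -1), (1, 0), (0, 1), (-1, 0)]
GAMMA = 0.95  # assumed discount factor


def is_terminal(state):
    return GRID[state[0]][state[1]] in ("H", "G")


def step(state, action):
    """Deterministic transition; moves off the grid leave the agent in place."""
    dr, dc = ACTIONS[action]
    nxt = (min(max(state[0] + dr, 0), N - 1), min(max(state[1] + dc, 0), N - 1))
    reward = 1.0 if GRID[nxt[0]][nxt[1]] == "G" else 0.0
    return nxt, reward


def q_value_iteration(tol=1e-10):
    """Dynamic programming: sweep Bellman optimality backups until they stop changing."""
    states = list(product(range(N), range(N)))
    n_act = len(ACTIONS)
    q = {(s, a): 0.0 for s in states for a in range(n_act)}
    while True:
        delta = 0.0
        for s in states:
            if is_terminal(s):
                continue  # terminal states keep Q = 0
            for a in range(n_act):
                s2, reward = step(s, a)
                future = 0.0 if is_terminal(s2) else max(q[(s2, b)] for b in range(n_act))
                target = reward + GAMMA * future
                delta = max(delta, abs(target - q[(s, a)]))
                q[(s, a)] = target
        if delta < tol:
            return q


def mean_absolute_error(q_hat, q_opt):
    """Average |q_hat - q_opt| over all state-action pairs."""
    return sum(abs(q_hat[k] - q_opt[k]) for k in q_opt) / len(q_opt)


def policy_accuracy(q_hat, q_opt, tol=1e-9):
    """Fraction of non-terminal states where q_hat's greedy action is optimal under q_opt."""
    states = [s for s in product(range(N), range(N)) if not is_terminal(s)]
    n_act = len(ACTIONS)
    hits = 0
    for s in states:
        greedy = max(range(n_act), key=lambda a: q_hat[(s, a)])
        best = max(q_opt[(s, a)] for a in range(n_act))
        hits += q_opt[(s, greedy)] >= best - tol  # ties count as optimal
    return hits / len(states)


q_opt = q_value_iteration()
# A crude "approximator": the optimal values rounded to one decimal place.
q_hat = {k: round(v, 1) for k, v in q_opt.items()}
print(mean_absolute_error(q_hat, q_opt), policy_accuracy(q_hat, q_opt))
```

In a setup like the paper's, `q_hat` would instead come from querying the trained (or compacted) XCSF population at each state-action pair, so the same two numbers can be compared before and after compaction.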

Code & Models

Repositories

jtbish/ppsn2020 (official)
