Tournament selection in zeroth-level classifier systems based on average reward reinforcement learning
Zhaoxiang Zang, Zhao Li, Junying Wang, Zhiping Dan

TL;DR
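This paper modifies the zeroth-level classifier system (ZCS) in two ways: it replaces the discounted-reward learning scheme with R-learning, an average-reward reinforcement learning method, and it replaces roulette wheel selection with tournament selection. Together these changes let the system handle large multi-step problems more effectively.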
Contribution
It replaces ZCS's discounted-reward reinforcement learning with R-learning and its roulette wheel selection with tournament selection, improving ZCS's ability to solve large multi-step problems.
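To make the difference between the two selection schemes concrete, here is a minimal Python sketch. It is not the authors' implementation: the function names, the `strengths` list, and the default tournament size of 3 are assumptions made for illustration. Roulette wheel selection draws a classifier with probability proportional to its strength, whereas tournament selection samples a few classifiers and keeps the strongest, so it depends only on the rank order of strengths.

```python
import random

def roulette_select(strengths, rng=random):
    """Return an index drawn with probability proportional to strength.

    Assumes strengths are non-negative with a positive total.
    """
    return rng.choices(range(len(strengths)), weights=strengths, k=1)[0]

def tournament_select(strengths, tournament_size=3, rng=random):
    """Return the index of the strongest classifier in a random sample.

    tournament_size is a hypothetical parameter; it is capped at the
    population size so that sampling without replacement is always valid.
    """
    k = min(tournament_size, len(strengths))
    contestants = rng.sample(range(len(strengths)), k)
    return max(contestants, key=lambda i: strengths[i])
```

Because only comparisons between strengths matter, tournament selection behaves the same if every strength is shifted or rescaled by a positive factor, which is not true of roulette wheel selection.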
Findings
Supports long action chains
Improves performance on large multi-step problems
Enhances scalability of ZCS
Abstract
As a genetics-based machine learning technique, the zeroth-level classifier system (ZCS) is based on a discounted-reward reinforcement learning algorithm, the bucket-brigade algorithm, which optimizes the discounted total reward received by an agent but is not suitable for all multi-step problems, especially large ones. Undiscounted reinforcement learning methods, such as R-learning, are available; these optimize the average reward per time step. In this paper, R-learning replaces the discounted-reward reinforcement learning approach in ZCS, and tournament selection replaces roulette wheel selection. The modification results in classifier systems that can support long action chains and are thus able to solve large multi-step problems.
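For readers unfamiliar with R-learning, the following is a minimal sketch of the standard tabular update, not the classifier-system variant developed in the paper. The function name, the dictionary-based table `R`, and the step sizes `beta` and `alpha` are assumptions chosen for illustration. The update subtracts a running estimate `rho` of the average reward per time step from each reward, rather than discounting future rewards.

```python
from collections import defaultdict

def r_learning_step(R, rho, s, a, reward, s_next, actions,
                    beta=0.1, alpha=0.01):
    """Apply one tabular R-learning update in place and return the new rho.

    R maps (state, action) pairs to relative action values.
    rho is the current estimate of the average reward per time step.
    beta and alpha are hypothetical step sizes for R and rho.
    """
    # Relative value of the best action in the next state.
    best_next = max(R[(s_next, b)] for b in actions)
    # Average-adjusted temporal-difference update of the action value.
    R[(s, a)] += beta * (reward - rho + best_next - R[(s, a)])
    # Update rho only when the action taken is greedy in state s.
    best_here = max(R[(s, b)] for b in actions)
    if R[(s, a)] == best_here:
        best_next = max(R[(s_next, b)] for b in actions)
        rho += alpha * (reward - rho + best_next - best_here)
    return rho

# Usage: a zero-initialised table and a single observed transition.
R = defaultdict(float)
rho = r_learning_step(R, 0.0, s=0, a=0, reward=1.0, s_next=1, actions=[0, 1])
```

Because no discount factor shrinks rewards that arrive many steps later, the values of early actions in a long chain do not decay toward zero, which is the property the abstract relies on for large multi-step problems.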
Taxonomy
Topics: Evolutionary Algorithms and Applications · Viral Infectious Diseases and Gene Expression in Insects
