Learning classifier systems with memory condition to solve non-Markov problems
Zhaoxiang Zang, Dehua Li, Junying Wang

TL;DR
This paper introduces an enhanced XCS classifier system with memory capabilities to effectively solve non-Markov, partially observable environments, outperforming existing methods in complex maze tasks.
Contribution
It develops a novel XCS-based classifier system with memory conditions and a detection method for non-Markov environments, improving learning in partially observable settings.
Findings
Outperforms existing classifier systems in maze environments
Successfully disambiguates non-Markov states using memory conditions
Demonstrates robustness in complex, partially observable environments
Abstract
Within the family of Learning Classifier Systems, the classifier system XCS has been applied successfully in many domains. However, standard XCS has no memory mechanism and can learn an optimal policy only in Markov environments, where the optimal action is determined solely by the current sensory input. In practice, most environments are only partially observable through the agent's sensors; these are also known as non-Markov environments. In such environments, XCS either fails or develops only a suboptimal policy, because it has no memory. In this work, we develop a new classifier system based on XCS to tackle this problem. It adds an internal message list to XCS as a memory list that records the history of input sensations, and extends a small number of classifiers with memory conditions. The classifier's memory condition, as a foothold to disambiguate non-Markov states, is used…
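The idea of a memory list plus memory conditions can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract is truncated before it explains how the memory condition is used, so the names `Classifier`, `memory_condition`, and `match_set`, the bounded memory length, and the rule that a memory condition is satisfied when any remembered sensation matches it are all assumptions made for illustration. Only the ternary matching with `#` as a wildcard follows standard XCS.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List, Optional


def matches(condition: str, sensation: str) -> bool:
    """Ternary XCS-style match: '#' is a wildcard, other symbols must be equal."""
    return len(condition) == len(sensation) and all(
        c == "#" or c == s for c, s in zip(condition, sensation)
    )


@dataclass
class Classifier:
    condition: str                          # matched against the current sensation
    action: int
    memory_condition: Optional[str] = None  # hypothetical: matched against past sensations

    def is_active(self, sensation: str, memory: Deque[str]) -> bool:
        if not matches(self.condition, sensation):
            return False
        if self.memory_condition is None:
            return True  # ordinary classifier without a memory condition
        # Assumption: the memory condition holds if ANY remembered sensation matches it.
        return any(matches(self.memory_condition, past) for past in memory)


def match_set(
    population: List[Classifier], sensation: str, memory: Deque[str]
) -> List[Classifier]:
    """Build the match set for one step, then record the sensation in the memory list."""
    active = [cl for cl in population if cl.is_active(sensation, memory)]
    memory.append(sensation)  # a bounded deque drops the oldest entry automatically
    return active
```

In this sketch, two classifiers can share the same condition for an aliased sensation and still be separated by history: with an empty memory only the classifier without a memory condition enters the match set, while the one carrying a memory condition joins it once a matching sensation has been recorded.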
