Criticality-Based Varying Step-Number Algorithm for Reinforcement Learning
Yitzhak Spielberg, Amos Azaria

TL;DR
This paper introduces a criticality-based varying step number (CVS) algorithm for reinforcement learning that adapts the step number to the criticality of each state and outperforms Deep Q-Learning and Monte Carlo in three test domains.
Contribution
It proposes CVS, a flexible step number algorithm that uses a criticality function, either provided by a human or learned directly from the environment, to adjust the step number.
Findings
CVS outperforms Deep Q-Learning and Monte Carlo methods.
The criticality of a state guides the choice of step number.
CVS was tested in three domains: Atari Pong, Road-Tree, and Shooter.
Abstract
In the context of reinforcement learning, we introduce the concept of criticality of a state, which indicates the extent to which the choice of action in that state influences the expected return. That is, a state in which the choice of action is more likely to influence the final outcome is considered more critical than a state in which it is less likely to do so. We formulate a criticality-based varying step number algorithm (CVS), a flexible step number algorithm that uses a criticality function provided by a human or learned directly from the environment. We test it in three different domains: the Atari Pong environment, the Road-Tree environment, and the Shooter environment. We demonstrate that CVS is able to outperform popular learning algorithms such as Deep Q-Learning and Monte Carlo.
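The abstract does not spell out how criticality is turned into a step number, so the following is only a minimal sketch of one way the idea could work, not the authors' method. It assumes a hypothetical rule: look ahead until the accumulated criticality of the visited states reaches a threshold, then use that many steps in a standard n-step return. The names `choose_step_number`, `n_step_target`, `threshold`, and `max_steps` are illustrative assumptions.

```python
from typing import Sequence


def choose_step_number(
    criticalities: Sequence[float],
    threshold: float = 1.0,   # assumed hyperparameter, not from the paper
    max_steps: int = 10,      # assumed cap on the look-ahead
) -> int:
    """Return the smallest n whose cumulative criticality reaches `threshold`.

    `criticalities[k]` is the criticality (assumed to lie in [0, 1]) of the
    state visited k steps after the current one. If the threshold is never
    reached, fall back to the longest available look-ahead.
    """
    if not criticalities:
        raise ValueError("need at least one criticality value")
    limit = min(max_steps, len(criticalities))
    total = 0.0
    for n in range(1, limit + 1):
        total += criticalities[n - 1]
        if total >= threshold:
            return n
    return limit


def n_step_target(
    rewards: Sequence[float],
    state_values: Sequence[float],
    n: int,
    gamma: float = 0.99,
) -> float:
    """Standard n-step return: n discounted rewards plus a bootstrapped value.

    `state_values[k]` is the current value estimate of the state reached
    after k steps, so it has one more entry than `rewards`; use 0.0 for a
    terminal state.
    """
    discounted_rewards = sum(gamma**k * rewards[k] for k in range(n))
    return discounted_rewards + gamma**n * state_values[n]


# Toy trajectory: the third state is highly critical.
rewards = [0, 0, 1, 0]
criticalities = [0.1, 0.2, 0.9, 0.1]
state_values = [0.5, 0.5, 0.5, 0.5, 0.0]

n = choose_step_number(criticalities, threshold=1.0)
target = n_step_target(rewards, state_values, n, gamma=0.5)
```

Under this assumed rule, a trajectory passing through highly critical states yields a short step number, while a run of low-criticality states yields a longer one. Setting `max_steps` to 1 recovers a one-step target as used in Q-learning, and an unreachable threshold over a full episode recovers a Monte Carlo return, which is why a varying step number can sit between the two baselines named in the abstract.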
Taxonomy
Topics: Evolutionary Algorithms and Applications · Reinforcement Learning in Robotics
Methods: Q-Learning
