Criticality-Based Varying Step-Number Algorithm for Reinforcement Learning

Yitzhak Spielberg; Amos Azaria

arXiv:2201.05034·cs.LG·January 14, 2022

Open Access

TL;DR

This paper introduces a criticality-based varying step-number (CVS) algorithm for reinforcement learning that adapts the number of steps used in each update to the criticality of the states involved. The authors report improved performance in three test domains.

Contribution

The paper proposes CVS, an algorithm that uses a criticality function, either provided by a human or learned from the environment, to adjust the step number dynamically.

Findings

01 CVS is reported to outperform Deep Q-Learning and Monte Carlo learning.

02 The criticality of a state guides how the step number is adapted.

03 Results are reported in three environments: Atari Pong, Road-Tree, and Shooter.

Abstract

In the context of reinforcement learning, we introduce the concept of the criticality of a state, which indicates the extent to which the choice of action in that particular state influences the expected return. That is, a state in which the choice of action is more likely to influence the final outcome is considered more critical than a state in which it is less likely to influence the final outcome. We formulate a criticality-based varying step-number algorithm (CVS), a flexible step-number algorithm that uses a criticality function provided by a human or learned directly from the environment. We test it in three different domains: the Atari Pong environment, the Road-Tree environment, and the Shooter environment. We demonstrate that CVS is able to outperform popular learning algorithms such as Deep Q-Learning and Monte Carlo.
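
The abstract does not state the rule by which criticality determines the step number, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that the step number n for an update starting at time t is the smallest n for which the accumulated criticality of the visited states reaches a threshold, and that the target is then an ordinary n-step return. The names choose_step_number, varying_step_target, threshold, and max_steps are hypothetical, and next_values[i] stands for an estimate of the value of the state reached after step i (for example, the maximum Q-value in that state).

```python
from typing import Sequence, Tuple


def choose_step_number(criticalities: Sequence[float], t: int,
                       threshold: float, max_steps: int) -> int:
    """Assumed rule: smallest n >= 1 whose accumulated criticality over
    states t .. t+n-1 reaches `threshold`, capped by `max_steps` and by
    the number of steps remaining in the episode."""
    limit = min(max_steps, len(criticalities) - t)
    total = 0.0
    for n in range(1, limit + 1):
        total += criticalities[t + n - 1]
        if total >= threshold:
            return n
    return limit


def varying_step_target(rewards: Sequence[float],
                        criticalities: Sequence[float],
                        next_values: Sequence[float],
                        t: int, gamma: float = 0.99,
                        threshold: float = 1.0,
                        max_steps: int = 10) -> Tuple[float, int]:
    """Return (target, n): an n-step return from time t, where n is chosen
    by `choose_step_number`. The trajectory is assumed to end in a
    terminal state, so no bootstrap value is added past its last step."""
    n = choose_step_number(criticalities, t, threshold, max_steps)
    # Discounted sum of the n observed rewards.
    target = sum(gamma ** k * rewards[t + k] for k in range(n))
    # Bootstrap from the value estimate unless the episode ended.
    if t + n < len(rewards):
        target += gamma ** n * next_values[t + n - 1]
    return target, n


if __name__ == "__main__":
    rewards = [0.0, 0.0, 1.0, 0.0]
    criticalities = [0.1, 0.2, 0.9, 0.1]
    next_values = [0.5, 0.5, 0.5, 0.0]
    print(varying_step_target(rewards, criticalities, next_values,
                              t=0, gamma=0.5))
```

Under this assumed rule, a run of highly critical states yields n = 1, which resembles the one-step target used in Q-learning, while a run of states with near-zero criticality extends the return toward the end of the episode, which resembles a Monte Carlo target.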

Taxonomy

Topics: Evolutionary Algorithms and Applications · Reinforcement Learning in Robotics

Methods: Q-Learning