Control Frequency Adaptation via Action Persistence in Batch Reinforcement Learning

Alberto Maria Metelli; Flavio Mazzolini; Lorenzo Bisi; Luca Sabbioni; Marcello Restelli

arXiv:2002.06836·cs.LG·July 14, 2020·6 cites

TL;DR

This paper studies how action persistence, the repetition of each action for a fixed number of decision steps, changes the control frequency and can improve reinforcement learning performance. It introduces the PFQI algorithm for learning at a given persistence and a heuristic for selecting the persistence.

Contribution

It introduces the concept of action persistence in RL, develops the PFQI algorithm to learn optimal value functions at different persistence levels, and proposes a heuristic for selecting optimal persistence.

Findings

01 Action persistence can significantly improve RL policy performance.

02 PFQI effectively learns optimal value functions with different persistence levels.

03 The proposed heuristic successfully identifies the best persistence setting.

Abstract

The choice of the control frequency of a system has a significant impact on the ability of reinforcement learning algorithms to learn a highly performing policy. In this paper, we introduce the notion of action persistence, which consists in repeating an action for a fixed number of decision steps and thereby modifies the control frequency. We start by analyzing how action persistence affects the performance of the optimal policy, and then we present a novel algorithm, Persistent Fitted Q-Iteration (PFQI), that extends FQI with the goal of learning the optimal value function at a given persistence. After providing a theoretical study of PFQI and a heuristic approach to identify the optimal persistence, we present an experimental campaign on benchmark domains to show the advantages of action persistence and to demonstrate the effectiveness of our persistence selection method.
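The notion of action persistence described in the abstract can be illustrated with a minimal sketch. This is not the authors' code and it is not PFQI: it only shows what "repeating an action for a fixed number of decision steps" might look like around a generic environment step function. The names persistent_step and toy_step, the (next_state, reward, done) return convention, and the choice to accumulate a discounted reward over the repeated steps are all assumptions made for illustration.

```python
from typing import Callable, Tuple

# Hypothetical base-step signature: (state, action) -> (next_state, reward, done).
StepFn = Callable[[float, float], Tuple[float, float, bool]]


def persistent_step(
    step: StepFn, state: float, action: float, k: int, gamma: float = 0.99
) -> Tuple[float, float, bool]:
    """Repeat `action` for up to `k` base steps (persistence k).

    Returns the final state, the discounted sum of the rewards collected
    during the repeated steps, and whether the episode terminated.
    """
    if k < 1:
        raise ValueError("persistence k must be at least 1")
    total_reward = 0.0
    discount = 1.0
    for _ in range(k):
        state, reward, done = step(state, action)
        total_reward += discount * reward
        discount *= gamma
        if done:
            # Stop repeating as soon as the episode ends.
            return state, total_reward, True
    return state, total_reward, False


def toy_step(state: float, action: float) -> Tuple[float, float, bool]:
    """Toy 1-D dynamics (illustrative only): move by `action`, pay |position|."""
    next_state = state + action
    reward = -abs(next_state)
    done = abs(next_state) >= 10.0
    return next_state, reward, done
```

With k = 1 the wrapper reduces to the base step, so the agent decides at the original control frequency; larger k means the agent chooses an action only once every k base steps.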

Code & Models

Repositories

albertometelli/pfqi
none · Official

Videos

Control Frequency Adaptation via Action Persistence in Batch Reinforcement Learning · slideslive

Taxonomy

Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications · Data Stream Mining Techniques