ACERAC: Efficient reinforcement learning in fine time discretization

Jakub Łyskawa; Paweł Wawrzyński

arXiv:2104.04004·cs.LG·July 12, 2022

TL;DR

This paper introduces a new reinforcement learning framework and algorithm that effectively handles fine time discretization in control systems by allowing dependent stochastic actions, improving performance over existing methods.

Contribution

The paper proposes an RL framework with dependent stochastic actions and an algorithm that optimizes multi-step returns, addressing challenges in fine time discretization control.

Findings

1. The proposed algorithm outperforms CDAU, PPO, SAC, and ACER in most tested scenarios.

2. It effectively manages dependent actions over time, reducing jerkiness in control.

3. Demonstrated improved sample efficiency and control quality in simulated environments.

Abstract

One of the main goals of reinforcement learning (RL) is to provide a way for physical machines to learn optimal behavior instead of being programmed. However, effective control of the machines usually requires fine time discretization. The most common RL methods apply independent random elements to each action, which is not suitable in that setting. It is not feasible because it causes the controlled system to jerk, and does not ensure sufficient exploration since a single action is not long enough to create a significant experience that could be translated into policy improvement. In our view, these are the main obstacles that prevent application of RL in contemporary control systems. To address these pitfalls, in this paper we introduce an RL framework and adequate analytical tools for actions that may be stochastically dependent in subsequent time instances. We also introduce an RL…
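The contrast the abstract draws between independent and stochastically dependent action noise can be illustrated with a small sketch. It is not taken from the paper: the AR(1) process, the parameter values (`sigma`, `alpha`), and the helper names (`white_noise`, `ar1_noise`, `mean_abs_step`) are all assumptions chosen for illustration. The sketch compares how much the noise changes from one time step to the next when both noise sources have the same overall standard deviation.

```python
import math
import random


def white_noise(n, sigma, rng):
    """Independent Gaussian noise: a fresh draw at every time step."""
    return [rng.gauss(0.0, sigma) for _ in range(n)]


def ar1_noise(n, sigma, alpha, rng):
    """Autocorrelated AR(1) noise with stationary standard deviation `sigma`.

    xi_t = alpha * xi_{t-1} + sqrt(1 - alpha^2) * eps_t,  eps_t ~ N(0, sigma^2)
    (an illustrative choice, not necessarily the paper's noise model)
    """
    scale = math.sqrt(1.0 - alpha * alpha)
    xi = rng.gauss(0.0, sigma)  # start in the stationary distribution
    out = []
    for _ in range(n):
        out.append(xi)
        xi = alpha * xi + scale * rng.gauss(0.0, sigma)
    return out


def mean_abs_step(xs):
    """Average absolute change between consecutive samples (a crude jerkiness proxy)."""
    return sum(abs(b - a) for a, b in zip(xs, xs[1:])) / (len(xs) - 1)


def std(xs):
    """Population standard deviation."""
    m = sum(xs) / len(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))


rng = random.Random(0)
n, sigma, alpha = 20000, 0.3, 0.9  # hypothetical values

independent = white_noise(n, sigma, rng)
dependent = ar1_noise(n, sigma, alpha, rng)

jerk_independent = mean_abs_step(independent)
jerk_dependent = mean_abs_step(dependent)

print(f"std:         independent={std(independent):.3f}  dependent={std(dependent):.3f}")
print(f"mean |step|: independent={jerk_independent:.3f}  dependent={jerk_dependent:.3f}")
```

Under these assumptions both sequences explore with a similar spread, but the dependent one moves far less between consecutive steps, so a deviation from the mean action persists over many fine time steps instead of being cancelled by the next draw.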

Taxonomy

Topics: Reinforcement Learning in Robotics · Scheduling and Optimization Algorithms · Machine Learning and Algorithms

Methods: 1x1 Convolution · Dilated Convolution · Convolution · Average Pooling · Global Average Pooling · Switchable Atrous Convolution