A framework for reinforcement learning with autocorrelated actions

Marcin Szulc; Jakub Łyskawa; Paweł Wawrzyński

arXiv:2009.04777·cs.LG·September 11, 2020

TL;DR

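This paper introduces a reinforcement learning framework in which the random component of each action is autocorrelated across time steps. Such policies spread exploration over time, which can give better clues for policy improvement, and they avoid the robot shaking caused by white-noise exploration. The proposed algorithm outperforms PPO, SAC, and ACER in three of four simulated control tasks (Ant, HalfCheetah, Hopper, Walker2D).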

Contribution

The paper presents an RL algorithm that approximately optimizes policies with autocorrelated actions, with advantages over white-noise exploration in both learning effectiveness and practical robotic applications.

Findings

1. Outperforms other algorithms in three of four benchmark tasks
2. Reduces robot shaking during control implementation
3. Enhances policy learning through autocorrelation in actions

Abstract

This paper concerns reinforcement learning. It considers policies that produce actions based on states and on random elements that are autocorrelated across subsequent time instants. Consequently, an agent learns from experiments that are distributed over time and potentially give better clues for policy improvement. Physical implementation of such policies, e.g. in robotics, is also less problematic, because it avoids making robots shake. This contrasts with most RL algorithms, which add white noise to the control signal and thereby cause unwanted shaking of the robots. An algorithm is introduced that approximately optimizes such policies. Its efficiency is verified on four simulated learning control problems (Ant, HalfCheetah, Hopper, and Walker2D) against three other methods (PPO, SAC, ACER). The algorithm outperforms the others in three of these problems.
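
To make the contrast between white noise and autocorrelated noise concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the first-order autoregressive (AR(1)) form, the value `alpha = 0.9`, and the helper names `white_noise`, `ar1_noise`, and `roughness` are all chosen here for illustration. The sketch compares the average step-to-step change of the two noise sequences, a rough proxy for how much an actuator would shake if the noise were added to its control signal.

```python
import math
import random


def white_noise(n, rng):
    """Independent N(0, 1) samples: no correlation between time steps."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def ar1_noise(n, alpha, rng):
    """AR(1) samples with unit stationary variance and lag-1 correlation alpha.

    Assumed form (illustrative): xi_t = alpha * xi_{t-1} + sqrt(1 - alpha^2) * eps_t.
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must be in [0, 1)")
    if n <= 0:
        return []
    scale = math.sqrt(1.0 - alpha * alpha)  # keeps the variance at 1
    xi = rng.gauss(0.0, 1.0)  # start from the stationary distribution
    out = [xi]
    for _ in range(n - 1):
        xi = alpha * xi + scale * rng.gauss(0.0, 1.0)
        out.append(xi)
    return out


def roughness(xs):
    """Mean absolute change between consecutive samples (a proxy for shaking)."""
    return sum(abs(b - a) for a, b in zip(xs, xs[1:])) / (len(xs) - 1)


rng = random.Random(0)  # fixed seed so the comparison is repeatable
white = white_noise(5000, rng)
smooth = ar1_noise(5000, alpha=0.9, rng=rng)
print(f"white roughness: {roughness(white):.3f}")
print(f"AR(1) roughness: {roughness(smooth):.3f}")
```

Both sequences have the same marginal variance, so the exploration magnitude is comparable; only the temporal structure differs. With `alpha` close to 1 the consecutive samples change slowly, which is the property the abstract associates with less shaking.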

Code & Models

Repositories

mszulc913/acerac (tf, Official)

Taxonomy

Methods: Dilated Convolution · Global Average Pooling · Average Pooling · Convolution · 1x1 Convolution · Switchable Atrous Convolution