Refined Continuous Control of DDPG Actors via Parametrised Activation

Mohammed Hossny; Julie Iskander; Mohammed Attia; Khaled Saleh

arXiv:2006.02818·cs.LG·June 5, 2020

TL;DR

This paper improves actor-critic reinforcement learning agents by parameterising the activation function of the actor's final layer, so that each actuator gets a tailored activation that adapts to actuator discrepancies and makes control more robust.

Contribution

The paper branches the action-producing layer of the actor to learn a tuning parameter for its activation function (e.g. Tanh or Sigmoid), which makes the agent more adaptable to actuator variations.

Findings

1. Average total episode reward increased by 23.15% in LunarLanderContinuous-v2 and by 33.80% in BipedalWalker-v2; Pendulum-v0 showed no significant improvement.

2. More stable actuation signals than existing methods.

3. Enhanced robustness and transferability for real-world actuator scenarios.

Abstract

In this paper, we propose enhancing actor-critic reinforcement learning agents by parameterising the final actor layer which produces the actions in order to accommodate the behaviour discrepancy of different actuators, under different load conditions during interaction with the environment. We propose branching the action producing layer in the actor to learn the tuning parameter controlling the activation layer (e.g. Tanh and Sigmoid). The learned parameters are then used to create tailored activation functions for each actuator. We ran experiments on three OpenAI Gym environments, i.e. Pendulum-v0, LunarLanderContinuous-v2 and BipedalWalker-v2. Results have shown an average of 23.15% and 33.80% increase in total episode reward of the LunarLanderContinuous-v2 and BipedalWalker-v2 environments, respectively. There was no significant improvement in Pendulum-v0 environment but the…
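
The branching idea in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the activation form `tanh(k * x)`, the use of softplus to keep each slope `k` positive, and all function and variable names below are assumptions made for illustration. The only point taken from the abstract is that the final actor layer is split into an action branch and a branch that produces one tuning parameter per actuator.

```python
import math
import random


def linear(weights, bias, x):
    """Dense layer; weights holds one row of coefficients per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def softplus(z):
    """Numerically stable softplus, used here to keep each slope positive.

    Constraining the slope this way is an assumption, not taken from the paper.
    """
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def parametrised_actor_head(features, w_act, b_act, w_k, b_k):
    """Hypothetical final actor layer with two branches.

    Returns (actions, slopes), where actions[i] = tanh(slopes[i] * pre[i]).
    """
    pre = linear(w_act, b_act, features)                        # action branch
    slopes = [softplus(z) for z in linear(w_k, b_k, features)]  # parameter branch
    actions = [math.tanh(k * p) for k, p in zip(slopes, pre)]
    return actions, slopes


# Toy forward pass with random weights; the sizes are arbitrary.
rng = random.Random(0)
n_features, n_actuators = 4, 2
features = [rng.uniform(-1, 1) for _ in range(n_features)]
w_act = [[rng.uniform(-1, 1) for _ in range(n_features)]
         for _ in range(n_actuators)]
w_k = [[rng.uniform(-1, 1) for _ in range(n_features)]
       for _ in range(n_actuators)]
b_act = [0.0] * n_actuators
b_k = [0.0] * n_actuators

actions, slopes = parametrised_actor_head(features, w_act, b_act, w_k, b_k)
print("slopes:", slopes)
print("actions:", actions)
```

Because each slope depends on the same features as the action, the steepness of every actuator's activation can change from state to state, which is one way to read "tailored activation functions for each actuator". In a real agent both branches would be trained jointly through the critic's gradient; this sketch covers only the forward pass.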
