Action Noise in Off-Policy Deep Reinforcement Learning: Impact on Exploration and Performance

Jakob Hollenstein; Sayantan Auddy; Matteo Saveriano; Erwan Renaudo; Justus Piater

arXiv:2206.03787 · cs.LG · June 6, 2023 · 6 cites

Open Access

TL;DR

This paper investigates how different types and scales of action noise affect exploration and performance in off-policy deep reinforcement learning, proposing new measures and heuristics for optimal noise configuration.

Contribution

The study systematically analyzes the impact of Gaussian and Ornstein-Uhlenbeck noise on exploration, introduces a robust state-space coverage measure, and offers heuristic guidelines for noise scheduling.

Findings

01

Larger noise scales increase state-space coverage but may not improve learning.

02

Reducing noise scale during training generally enhances performance.

03

Environment-dependent noise choices are crucial for optimal exploration.
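
Finding 02 concerns reducing the noise scale over the course of training. The sketch below shows one possible schedule, a linear decay of the scale with training progress. The function name `linear_noise_scale`, its parameters, and the example values are hypothetical and chosen only for illustration; this is not presented as the schedule used in the paper.

```python
def linear_noise_scale(step, total_steps, sigma_start, sigma_end):
    """Linearly interpolate the noise scale from sigma_start to sigma_end.

    Hypothetical helper for illustration: `step` is the current training
    step and `total_steps` the planned training length.
    """
    # Fraction of training completed, clamped to [0, 1].
    frac = min(max(step / total_steps, 0.0), 1.0)
    return sigma_start + frac * (sigma_end - sigma_start)


# Example: decay the scale from 0.5 to 0.1 over 100 steps (assumed values).
scales = [linear_noise_scale(s, 100, 0.5, 0.1) for s in (0, 50, 100)]
```

A constant scale is recovered by setting `sigma_start` equal to `sigma_end`, which makes it easy to compare a scheduled run with a fixed-scale baseline.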

Abstract

Many Deep Reinforcement Learning (D-RL) algorithms rely on simple forms of exploration such as the additive action noise often used in continuous control domains. Typically, the scaling factor of this action noise is chosen as a hyper-parameter and is kept constant during training. In this paper, we focus on action noise in off-policy deep reinforcement learning for continuous control. We analyze how the learned policy is impacted by the noise type, the noise scale, and the schedule for reducing the scaling factor. We consider the two most prominent types of action noise, Gaussian and Ornstein-Uhlenbeck noise, and perform a vast experimental campaign by systematically varying the noise type and scale parameter, and by measuring variables of interest like the expected return of the policy and the state-space coverage during exploration. For the latter, we propose a novel state-space coverage…
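
To make the two noise types named in the abstract concrete, here is a minimal sketch in Python with NumPy. Gaussian noise is drawn independently at every step, while Ornstein-Uhlenbeck noise is temporally correlated. The function names, the discretization, and the defaults (`theta=0.15`, `dt=0.01`, action bounds of -1 and 1) are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def gaussian_noise(rng, n_steps, action_dim, sigma):
    """Independent zero-mean Gaussian noise with scale sigma."""
    return sigma * rng.standard_normal((n_steps, action_dim))


def ou_noise(rng, n_steps, action_dim, sigma, theta=0.15, dt=0.01):
    """Ornstein-Uhlenbeck noise via a simple Euler-Maruyama discretization.

    theta and dt are assumed defaults; the process reverts to a mean of 0.
    """
    noise = np.zeros((n_steps, action_dim))
    x = np.zeros(action_dim)
    for t in range(n_steps):
        # Mean reversion towards 0 plus a scaled random increment.
        x = x - theta * x * dt + sigma * np.sqrt(dt) * rng.standard_normal(action_dim)
        noise[t] = x
    return noise


def noisy_action(policy_action, noise, low=-1.0, high=1.0):
    """Add exploration noise to a policy action and clip to the action bounds."""
    return np.clip(policy_action + noise, low, high)


rng = np.random.default_rng(0)
g = gaussian_noise(rng, 1000, 2, sigma=0.3)
ou = ou_noise(rng, 1000, 2, sigma=0.3)
```

Because consecutive Ornstein-Uhlenbeck samples depend on each other, the perturbation drifts in one direction for several steps, whereas Gaussian samples change direction independently at every step.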

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Software Engineering Research