Hysteresis-Based RL: Robustifying Reinforcement Learning-based Control Policies via Hybrid Control

Jan de Priester; Ricardo G. Sanfelice; Nathan van de Wouw

arXiv:2204.00654 · cs.LG · April 5, 2022

TL;DR

This paper introduces HyRL, a hybrid reinforcement learning algorithm that augments an existing RL algorithm with hysteresis switching and two stages of learning, targeting the lack of robustness that policies derived with PPO and DQN can exhibit.

Contribution

The paper proposes HyRL, a novel hybrid RL method that improves robustness of control policies through hysteresis switching and two-stage learning, demonstrated on challenging control problems.

Findings

01  HyRL outperforms PPO and DQN in robustness on tested problems.

02  Hysteresis switching enhances stability of learned policies.

03  Two-stage learning improves policy robustness and performance.

Abstract

Reinforcement learning (RL) is a promising approach for deriving control policies for complex systems. As we show in two control problems, the derived policies from using the Proximal Policy Optimization (PPO) and Deep Q-Network (DQN) algorithms may lack robustness guarantees. Motivated by these issues, we propose a new hybrid algorithm, which we call Hysteresis-Based RL (HyRL), augmenting an existing RL algorithm with hysteresis switching and two stages of learning. We illustrate its properties in two examples for which PPO and DQN fail.
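
The abstract names hysteresis switching without describing it. As a rough illustration of the general idea only (not the paper's HyRL algorithm), the sketch below compares a memoryless rule that picks between two policies at a single threshold with a hysteresis rule that keeps a logic variable q and changes it only once the measurement leaves an overlap band. The scalar state, the threshold at zero, the band half-width delta, and the uniform measurement noise are all assumptions made for this example.

```python
import random


def memoryless_select(x_measured):
    """Pick policy 1 at or above the threshold 0.0, policy 0 otherwise."""
    return 1 if x_measured >= 0.0 else 0


def hysteresis_select(q, x_measured, delta):
    """Keep the current policy index q unless the measurement has clearly
    left the overlap band [-delta, delta] on the opposite side."""
    if q == 0 and x_measured > delta:
        return 1
    if q == 1 and x_measured < -delta:
        return 0
    return q


def count_switches(selections):
    """Number of times consecutive selections differ."""
    return sum(1 for a, b in zip(selections, selections[1:]) if a != b)


def simulate(delta=0.2, noise=0.1, steps=200, seed=0):
    """Hold the true state on the threshold and feed both rules the same
    noisy measurements (hypothetical setup, chosen to provoke chattering)."""
    rng = random.Random(seed)
    x_true = 0.0
    q = 0
    memoryless, hysteretic = [], []
    for _ in range(steps):
        x_measured = x_true + rng.uniform(-noise, noise)
        memoryless.append(memoryless_select(x_measured))
        q = hysteresis_select(q, x_measured, delta)
        hysteretic.append(q)
    return count_switches(memoryless), count_switches(hysteretic)


if __name__ == "__main__":
    m, h = simulate()
    print("memoryless switches:", m)
    print("hysteresis switches:", h)
```

Because the noise bound (0.1) is smaller than delta (0.2), the measurement never leaves the overlap band, so the hysteresis rule never changes q, while the memoryless rule flips whenever the noise changes sign. How HyRL chooses its switching regions and how the two learning stages fit in are described in the paper itself and are not reproduced here.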

Taxonomy

Topics: Reinforcement Learning in Robotics

Methods: Q-Learning · Entropy Regularization · Convolution · Dense Connections · Proximal Policy Optimization · Deep Q-Network