Don't do it: Safer Reinforcement Learning With Rule-based Guidance

Ekaterina Nikonova; Cheng Xue; Jochen Renz

arXiv:2212.13819 · cs.AI · December 29, 2022

TL;DR

This paper introduces a rule-based safety mechanism for reinforcement learning that overrides unsafe actions, leading to safer training, faster convergence, and improved performance.

Contribution

The paper proposes a safe epsilon-greedy algorithm that uses safety rules to override an agent's actions when they are considered unsafe.

Findings

01 Significantly increased safety during training.

02 Faster convergence compared to baseline.

03 Achieved better overall performance.

Abstract

During training, reinforcement learning systems interact with the world without considering the safety of their actions. When deployed into the real world, such systems can be dangerous and cause harm to their surroundings. Often, dangerous situations can be mitigated by defining a set of rules that the system should not violate under any conditions. For example, in robot navigation, one safety rule would be to avoid colliding with surrounding objects and people. In this work, we define safety rules in terms of the relationships between the agent and objects and use them to prevent reinforcement learning systems from performing potentially harmful actions. We propose a new safe epsilon-greedy algorithm that uses safety rules to override agents' actions if they are considered to be unsafe. In our experiments, we show that a safe epsilon-greedy policy significantly increases the safety of…
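The override idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the grid-world collision rule, and the fallback used when every action is unsafe are all assumptions made for illustration. The sketch proposes an action with ordinary epsilon-greedy selection and, if a safety rule flags that action, replaces it with the highest-valued action that the rule allows.

```python
import random
from typing import Callable, Dict, Optional, Set, Tuple

# Hypothetical grid-world moves; not taken from the paper.
MOVES = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}


def collides(agent: Tuple[int, int], obstacles: Set[Tuple[int, int]], action: str) -> bool:
    """Hypothetical safety rule: a move is unsafe if it lands on an obstacle."""
    dx, dy = MOVES[action]
    return (agent[0] + dx, agent[1] + dy) in obstacles


def safe_epsilon_greedy(
    q_values: Dict[str, float],
    is_unsafe: Callable[[str], bool],
    epsilon: float,
    rng: Optional[random.Random] = None,
) -> str:
    """Epsilon-greedy selection with a rule-based override (illustrative only)."""
    if rng is None:
        rng = random.Random()
    actions = list(q_values)

    # Step 1: ordinary epsilon-greedy proposal.
    if rng.random() < epsilon:
        proposed = rng.choice(actions)
    else:
        proposed = max(actions, key=q_values.get)

    # Step 2: keep the proposal if no safety rule flags it.
    if not is_unsafe(proposed):
        return proposed

    # Step 3: otherwise override with the best-valued safe action.
    safe_actions = [a for a in actions if not is_unsafe(a)]
    if not safe_actions:
        # Assumption: with no safe alternative, fall back to the proposal.
        return proposed
    return max(safe_actions, key=q_values.get)


# Usage: the greedy action "right" would hit the obstacle at (2, 1).
agent = (1, 1)
obstacles = {(2, 1)}
q = {"up": 0.1, "down": 0.2, "left": 0.0, "right": 0.9}
choice = safe_epsilon_greedy(q, lambda a: collides(agent, obstacles, a), epsilon=0.0)
print(choice)  # prints "down"
```

Passing the rule as a predicate keeps the selection logic independent of any particular environment; how the paper actually chooses the replacement action is not stated in the abstract, so the "best safe action" choice here is only one plausible option.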

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Occupational Health and Safety Research · Autonomous Vehicle Technology and Safety

Methods: Balanced Selection