Learning to Act Safely with Limited Exposure and Almost Sure Certainty

Agustin Castellano; Hancheng Min; Juan Bazerque; Enrique Mallada

arXiv:2105.08748 · eess.SY · February 14, 2023

TL;DR

This paper shows that safe actions in unknown environments can be learned with a finite number of exploratory trials by trading off optimality, exposure to unsafe events, and the detection time of unsafe actions. It treats both multi-armed bandit problems and MDPs.

Contribution

It proposes algorithms that are guaranteed to detect unsafe actions in a finite expected number of rounds, and it exposes the trade-offs among safety, exploration, and learning speed.

Findings

01. Algorithms detect unsafe actions in finite expected time

02. Exposure to unsafe events trades off against detection speed

03. Safety constraints can accelerate learning

Abstract

This paper puts forward the concept that learning to take safe actions in unknown environments, even with probability one guarantees, can be achieved without the need for an unbounded number of exploratory trials. This is indeed possible, provided that one is willing to navigate trade-offs between optimality, level of exposure to unsafe events, and the maximum detection time of unsafe actions. We illustrate this concept in two complementary settings. We first focus on the canonical multi-armed bandit problem and study the intrinsic trade-offs of learning safety in the presence of uncertainty. Under mild assumptions on sufficient exploration, we provide an algorithm that provably detects all unsafe machines in an (expected) finite number of rounds. The analysis also unveils a trade-off between the number of rounds needed to secure the environment and the probability of discarding safe…
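The trade-off between detection time and the probability of discarding safe machines can be illustrated with a small simulation. The sketch below is not the paper's algorithm: it assumes a generic one-sided sequential likelihood-ratio test, applied round-robin to arms that emit binary unsafe events. The function name `detect_unsafe_arms` and the parameters `p_safe`, `p_unsafe`, and `threshold` are hypothetical and chosen only for illustration.

```python
import math
import random


def detect_unsafe_arms(unsafe_probs, p_safe=0.1, p_unsafe=0.5,
                       threshold=math.log(100.0), rounds=1000, seed=0):
    """Illustrative sketch (assumption, not the paper's method).

    Each arm emits an unsafe event with the given probability. Per arm we
    accumulate a log-likelihood ratio of "unsafe" (rate p_unsafe) against
    "safe" (rate p_safe) and discard the arm once it crosses `threshold`.
    Returns a dict mapping each discarded arm to the round it was discarded.
    """
    rng = random.Random(seed)
    up = math.log(p_unsafe / p_safe)                    # step on an unsafe event
    down = math.log((1.0 - p_unsafe) / (1.0 - p_safe))  # step on a safe pull
    llr = [0.0] * len(unsafe_probs)
    discarded_at = {}
    for t in range(1, rounds + 1):
        for arm, p in enumerate(unsafe_probs):
            if arm in discarded_at:
                continue  # discarded arms are never pulled again
            unsafe_event = rng.random() < p
            llr[arm] += up if unsafe_event else down
            if llr[arm] >= threshold:
                discarded_at[arm] = t
    return discarded_at


if __name__ == "__main__":
    # Arm 0 never produces unsafe events; arm 1 always does.
    print(detect_unsafe_arms([0.0, 1.0]))
    # A lower threshold flags arm 1 sooner.
    print(detect_unsafe_arms([0.0, 1.0], threshold=math.log(10.0)))
```

In this sketch, lowering `threshold` shortens the time needed to flag an unsafe arm, but for arms with a small nonzero unsafe-event rate it also raises the chance of discarding an arm that is actually safe, which mirrors the kind of trade-off the abstract describes.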

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Machine Learning and Algorithms · Data Stream Mining Techniques