RL-STPA: Adapting System-Theoretic Hazard Analysis for Safety-Critical Reinforcement Learning

Steven A. Senczyszyn; Timothy C. Havens; Nathaniel Rice; Jason E. Summers; Benjamin D. Werner; Benjamin J. Schumeg

arXiv:2604.15201·cs.LG·April 17, 2026

TL;DR

RL-STPA adapts System-Theoretic Process Analysis (STPA) to identify hazards in reinforcement learning policies used in safety-critical systems such as autonomous drones.

Contribution

RL-STPA introduces hierarchical subtask decomposition, coverage-guided perturbation testing, and iterative checkpoints that feed identified hazards back into training, with the aim of improving safety evaluation of RL policies.

Findings

01

Revealed potential loss scenarios in autonomous drone navigation.

02

Provided quantitative safety coverage metrics.

03

Demonstrated hazard identification beyond standard RL evaluations.

Abstract

As reinforcement learning (RL) deployments expand into safety-critical domains, existing evaluation methods fail to systematically identify hazards arising from the black-box nature of neural-network-enabled policies and from distributional shift between training and deployment. This paper introduces Reinforcement Learning System-Theoretic Process Analysis (RL-STPA), a framework that adapts conventional STPA's systematic hazard analysis to address RL's unique challenges through three key contributions: hierarchical subtask decomposition using both temporal phase analysis and domain expertise to capture emergent behaviors, coverage-guided perturbation testing that explores the sensitivity of state-action spaces, and iterative checkpoints that feed identified hazards back into training through reward shaping and curriculum design. We demonstrate RL-STPA in the safety-critical test case of…
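
To make the idea of coverage-guided perturbation testing concrete, here is a minimal Python sketch. It is not the authors' implementation: the 2-D perturbation grid, the function `coverage_guided_test`, and the stand-in safety check `toy_is_safe` are all hypothetical names and assumptions chosen for illustration. The sketch bins the perturbation space, prefers bins that have not yet been sampled, and reports both the fraction of bins covered and the perturbations that led to an unsafe outcome.

```python
import random
from typing import Callable, Dict, List, Tuple

Perturbation = Tuple[float, float]


def toy_is_safe(dx: float, dy: float) -> bool:
    """Hypothetical stand-in for rolling out a policy under a perturbation.

    A real check would run the RL policy in simulation; here large
    combined perturbations are simply declared unsafe.
    """
    return abs(dx) + abs(dy) < 1.5


def coverage_guided_test(
    is_safe: Callable[[float, float], bool],
    bins: int = 5,
    low: float = -1.0,
    high: float = 1.0,
    budget: int = 500,
    seed: int = 0,
) -> Tuple[float, List[Perturbation]]:
    """Sample 2-D perturbations, preferring grid cells not yet visited.

    Returns (coverage, failures), where coverage is the fraction of grid
    cells sampled at least once and failures lists unsafe perturbations.
    """
    rng = random.Random(seed)
    width = (high - low) / bins
    all_cells = [(i, j) for i in range(bins) for j in range(bins)]
    visits: Dict[Tuple[int, int], int] = {}
    failures: List[Perturbation] = []

    for _ in range(budget):
        unvisited = [c for c in all_cells if c not in visits]
        # Coverage guidance: pick an unvisited cell while any remain.
        cell = rng.choice(unvisited) if unvisited else rng.choice(all_cells)
        # Draw a perturbation uniformly inside the chosen cell.
        dx = low + (cell[0] + rng.random()) * width
        dy = low + (cell[1] + rng.random()) * width
        visits[cell] = visits.get(cell, 0) + 1
        if not is_safe(dx, dy):
            failures.append((dx, dy))

    coverage = len(visits) / len(all_cells)
    return coverage, failures


if __name__ == "__main__":
    coverage, failures = coverage_guided_test(toy_is_safe)
    print(f"coverage: {coverage:.2f}, unsafe perturbations: {len(failures)}")
```

Under these assumptions, the recorded failures would be the candidates to feed back into training, for example as penalized states in a shaped reward or as harder stages in a curriculum; how RL-STPA actually does this is described in the paper itself, not in this sketch.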
