Safety-Critical Reinforcement Learning with Viability-Based Action Shielding for Hypersonic Longitudinal Flight
Hossein Rastgoftar

TL;DR
This paper introduces a safety-critical reinforcement learning framework for hypersonic longitudinal flight control. It combines reachability-based action shielding with state abstraction so that unsafe actions are never intentionally selected during learning or execution.
Contribution
It develops a viability-based action shielding method combined with state abstraction for safe reinforcement learning in nonlinear dynamical systems with continuous states and inputs.
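To make the idea of action shielding concrete, the following is a minimal sketch, not the paper's implementation. It assumes a toy scalar system `x_next = x + DT * u` in place of the hypersonic model, a hypothetical safety box `|x| <= X_MAX`, and a one-step successor check as a simplified stand-in for a full viability or reachability computation. All names (`step`, `admissible_actions`, `shielded_epsilon_greedy`) are illustrative.

```python
import numpy as np

DT = 0.1                                        # assumed time step
X_MAX = 1.0                                     # half-width of the hypothetical safety box
ACTIONS = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])  # assumed discretized inputs


def step(x, u):
    """Toy scalar dynamics (a stand-in, not the hypersonic vehicle model)."""
    return x + DT * u


def admissible_actions(x):
    """Indices of actions whose one-step successor stays inside the box."""
    successors = step(x, ACTIONS)
    return np.flatnonzero(np.abs(successors) <= X_MAX)


def shielded_epsilon_greedy(q_row, x, eps, rng):
    """Epsilon-greedy selection restricted to the admissible action set."""
    allowed = admissible_actions(x)
    if allowed.size == 0:
        # No action keeps the successor in the box: fall back to the
        # action that brings the state closest to the box center.
        return int(np.argmin(np.abs(step(x, ACTIONS))))
    if rng.random() < eps:
        return int(rng.choice(allowed))          # explore among safe actions only
    return int(allowed[np.argmax(q_row[allowed])])  # exploit among safe actions only
```

The design point is that the shield acts on the action set rather than on the reward: an action with a high learned value is still excluded if its successor leaves the box.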
Findings
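Successfully applied to a hypersonic vehicle model
Unsafe actions are never intentionally selected during learning or execution, because hard safety constraints are enforced independently of the reward
Nominal tracking and recovery toward the safe region are handled within a single control architecture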
Abstract
This paper presents a safety-critical reinforcement learning framework for nonlinear dynamical systems with continuous state and input spaces operating under explicit physical constraints. Hard safety constraints are enforced independently of the reward through action shielding and reachability-based admissible action sets, ensuring that unsafe behaviors are never intentionally selected during learning or execution. To capture nominal operation and recovery behavior within a single control architecture, the state space is partitioned into safe and unsafe regions based on membership in a safety box, and a mode-dependent reward is used to promote accurate tracking inside the safe region and recovery toward it when operating outside. To enable online tabular learning on continuous dynamics, a finite-state abstraction is constructed via state aggregation, and action selection and value…
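The state aggregation and mode-dependent reward described in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the paper's construction: the box bounds `LOW` and `HIGH`, the uniform grid with `BINS` cells per dimension over `[-span, span]`, the quadratic tracking cost, and the constant penalty of 10 outside the box are all hypothetical choices. The update shown is standard tabular Q-learning; the source does not confirm which update rule the paper uses.

```python
import numpy as np

# Hypothetical two-dimensional safety box.
LOW = np.array([-1.0, -1.0])
HIGH = np.array([1.0, 1.0])
BINS = 10  # assumed number of grid cells per state dimension


def in_safety_box(x):
    """True if every state component lies within the box bounds."""
    return bool(np.all((x >= LOW) & (x <= HIGH)))


def aggregate(x, span=2.0):
    """Map a continuous state to one cell index of a uniform grid."""
    x = np.clip(x, -span, span)
    idx = np.floor((x + span) / (2.0 * span) * BINS).astype(int)
    idx = np.minimum(idx, BINS - 1)  # keep the upper edge in the last cell
    return int(np.ravel_multi_index(tuple(idx), (BINS,) * x.size))


def mode_reward(x, x_ref):
    """Tracking reward inside the box, recovery reward outside it."""
    if in_safety_box(x):
        return -float(np.sum((x - x_ref) ** 2))
    # Outside the box: penalize the distance back to the box boundary.
    excess = np.maximum(LOW - x, 0.0) + np.maximum(x - HIGH, 0.0)
    return -10.0 - float(np.linalg.norm(excess))


def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.95):
    """One standard tabular Q-learning update on the aggregated states."""
    target = r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (target - Q[s, a])
```

For example, with these assumed settings the state `[0.0, 0.0]` falls in grid cell 55, and a state at `[1.5, 0.0]` receives the recovery reward because it lies outside the box.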
Taxonomy
Topics: Adaptive Dynamic Programming Control · Model Reduction and Neural Networks · Reinforcement Learning in Robotics
