Hierarchical Reinforcement Learning with Runtime Safety Shielding for Power Grid Operation
Gitesh Malik

TL;DR
This paper introduces a hierarchical reinforcement learning framework with a runtime safety shield for power-grid control, improving safety and robustness and generalizing to unseen grids without retraining.
Contribution
It proposes a novel hierarchical control architecture that decouples decision-making from safety enforcement, enabling safe and robust power grid operation in unseen scenarios.
Findings
The hierarchical approach outperforms flat RL under stress tests.
The safety shield enforces safety as a runtime invariant.
Zero-shot deployment achieves robust performance without retraining.
Abstract
Reinforcement learning has shown promise for automating power-grid operation tasks such as topology control and congestion management. However, its deployment in real-world power systems remains limited by strict safety requirements, brittleness under rare disturbances, and poor generalization to unseen grid topologies. In safety-critical infrastructure, catastrophic failures cannot be tolerated, and learning-based controllers must operate within hard physical constraints. This paper proposes a safety-constrained hierarchical control framework for power-grid operation that explicitly decouples long-horizon decision-making from real-time feasibility enforcement. A high-level reinforcement learning policy proposes abstract control actions, while a deterministic runtime safety shield filters unsafe actions using fast forward simulation. Safety is enforced as a runtime invariant,…
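The decoupling described in the abstract, a policy that proposes actions and a deterministic shield that filters them by forward simulation, can be illustrated with a minimal sketch. Everything in it is a hypothetical stand-in rather than the paper's implementation: the toy forward model `simulate`, the per-line loading representation, the 1.0 loading limit, and the do-nothing fallback are all assumptions chosen for illustration.

```python
from typing import Callable, List, Sequence

# Hypothetical types, for illustration only.
State = List[float]   # per-line loading as a fraction of the thermal limit
Action = int          # index of an abstract control action; 0 means "do nothing"


def simulate(state: State, action: Action) -> State:
    """Toy forward model (assumption): action k > 0 relieves line k-1 by 0.3
    and shifts 0.1 of extra loading onto every other line."""
    if action == 0:
        return list(state)
    target = action - 1
    return [
        max(0.0, load - 0.3) if i == target else load + 0.1
        for i, load in enumerate(state)
    ]


def is_safe(state: State, limit: float = 1.0) -> bool:
    """Safety check (assumption): no line may exceed its thermal limit."""
    return all(load <= limit for load in state)


def shield(
    state: State,
    ranked_actions: Sequence[Action],
    forward: Callable[[State, Action], State] = simulate,
    fallback: Action = 0,
) -> Action:
    """Return the first proposed action whose simulated outcome is safe.

    The policy only supplies a preference order; the shield is deterministic
    and never learns. If no proposal passes, the fallback action is returned
    without a further check, which is a simplification of this sketch.
    """
    for action in ranked_actions:
        if is_safe(forward(state, action)):
            return action
    return fallback


# The policy prefers action 2, but simulating it overloads line 0,
# so the shield passes over it and accepts action 1 instead.
chosen = shield([1.1, 0.5, 0.6], ranked_actions=[2, 1, 0])
```

Because the shield only consumes a ranked list of proposals, the policy behind it can be swapped or deployed on a new grid without touching the safety logic, which is the point of separating the two layers.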
