Reinforcement Learning with Adaptive Regularization for Safe Control of Critical Systems
Haozhe Tian, Homayoun Hamedmoghadam, Robert Shorten, Pietro Ferraro

TL;DR
This paper introduces RL-AR, an algorithm that combines an RL policy with a safety-constrained policy regularizer through a state-dependent "focus module," keeping exploration safe during training while reaching returns competitive with standard model-free RL.
Contribution
RL-AR is a novel algorithm that adaptively combines safe policy regularization with RL, improving safety without sacrificing learning performance in critical control tasks.
Findings
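RL-AR ensures safety during training in the critical control applications tested.
RL-AR achieves returns competitive with standard model-free RL that disregards safety.
RL-AR relies more on the safe policy regularizer in less-exploited states and allows unbiased convergence in well-exploited states.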
Abstract
Reinforcement Learning (RL) is a powerful method for controlling dynamic systems, but its learning mechanism can lead to unpredictable actions that undermine the safety of critical systems. Here, we propose RL with Adaptive Regularization (RL-AR), an algorithm that enables safe RL exploration by combining the RL policy with a policy regularizer that hard-codes the safety constraints. RL-AR performs policy combination via a "focus module," which determines the appropriate combination depending on the state, relying more on the safe policy regularizer for less-exploited states while allowing unbiased convergence for well-exploited states. In a series of critical control applications, we demonstrate that RL-AR not only ensures safety during training but also achieves a return competitive with the standards of model-free RL that disregards safety.
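The state-dependent combination described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the convex-combination form, the visit-count heuristic standing in for the focus module, and the names `CountBasedFocus`, `combine_actions`, and `decay` are all assumptions introduced here for illustration only.

```python
import math
from collections import defaultdict


class CountBasedFocus:
    """Hypothetical stand-in for a focus module (an assumption, not the
    paper's design): the weight on the safe regularizer starts at 1 for an
    unvisited state and decays toward 0 as the state is visited more often."""

    def __init__(self, decay=0.1):
        self.decay = decay
        self.visits = defaultdict(int)

    def weight(self, state_key):
        # exp(-decay * visits) lies in (0, 1] and equals 1 for a new state.
        return math.exp(-self.decay * self.visits[state_key])

    def record_visit(self, state_key):
        self.visits[state_key] += 1


def combine_actions(beta, safe_action, rl_action):
    """Assumed convex combination: beta weights the safe regularizer's
    action and (1 - beta) weights the RL policy's action."""
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    return [beta * a_s + (1.0 - beta) * a_r
            for a_s, a_r in zip(safe_action, rl_action)]
```

Under these assumptions, a state seen for the first time receives the safe regularizer's action unchanged, and repeated visits shift the combined action toward the RL policy's output.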
Taxonomy
Topics: Fault Detection and Control Systems · Advanced Control Systems Optimization · Adaptive Dynamic Programming Control
Methods: Focus
