A Safe Reinforcement Learning Algorithm for Supervisory Control of Power Plants
Yixuan Sun, Sami Khairy, Richard B. Vilim, Rui Hu, Akshay J. Dave

TL;DR
This paper introduces a novel chance-constrained reinforcement learning algorithm based on Proximal Policy Optimization, designed for supervisory control of power plants, effectively managing state constraints without explicit environment modeling.
Contribution
It presents a new RL method with Lagrangian relaxation for power plant control, addressing state constraints in a model-free setting.
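As a rough illustration of what Lagrangian relaxation of a chance constraint can look like, one common textbook form is written out below. The symbols ($J$, $\pi_\theta$, $\mathcal{T}_{\text{safe}}$, $\delta$, $\lambda$) are assumptions chosen for this sketch; the paper's own notation and exact constraint definition are not given in this summary and may differ.

```latex
% Chance-constrained policy optimization (assumed generic form):
% maximize expected return while keeping trajectories safe with probability >= 1 - delta.
\max_{\theta}\; J(\theta) = \mathbb{E}_{\tau \sim \pi_\theta}\Big[\sum_{t=0}^{T} \gamma^{t} r_t\Big]
\quad \text{s.t.} \quad
\Pr_{\tau \sim \pi_\theta}\big(\tau \in \mathcal{T}_{\text{safe}}\big) \ge 1 - \delta

% Lagrangian relaxation: the constraint moves into the objective, weighted by a multiplier lambda >= 0.
\min_{\lambda \ge 0}\; \max_{\theta}\;
\mathcal{L}(\theta, \lambda) = J(\theta)
+ \lambda \Big( \Pr_{\tau \sim \pi_\theta}\big(\tau \in \mathcal{T}_{\text{safe}}\big) - (1 - \delta) \Big)
```

In this form, a policy that violates the constraint makes the bracketed term negative, so a larger $\lambda$ penalizes it more heavily.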
Findings
Achieved minimal constraint violation in load-follow maneuvers
Demonstrated effectiveness on an advanced nuclear power plant design
Outperformed standard RL methods in constraint management
Abstract
Traditional control-theory-based methods require tailored engineering for each system and constant fine-tuning. In power plant control, one often needs to obtain a precise representation of the system dynamics and carefully design the control scheme accordingly. Model-free reinforcement learning (RL) has emerged as a promising solution for control tasks due to its ability to learn from trial-and-error interactions with the environment. It eliminates the need to explicitly model the environment's dynamics, a step that can introduce inaccuracies. However, the direct imposition of state constraints in power plant control raises challenges for standard RL methods. To address this, we propose a chance-constrained RL algorithm based on Proximal Policy Optimization for supervisory control. Our method employs Lagrangian relaxation to convert the constrained optimization problem into an…
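The Lagrangian relaxation mentioned in the abstract is often implemented as two alternating steps: the policy is trained on a reward penalized by a multiplier, and the multiplier is raised whenever the observed violation rate exceeds the tolerated level. The sketch below is a minimal illustration of that generic pattern, not the authors' implementation. The function names, the per-step 0/1 violation indicator, the tolerance `delta`, and the step size `lr` are all assumptions made for this example.

```python
import numpy as np


def penalized_rewards(rewards, violations, lam):
    """Return rewards minus lam times a 0/1 constraint-violation indicator.

    A PPO-style learner would be trained on these shaped rewards
    (hypothetical helper, assumed for illustration).
    """
    rewards = np.asarray(rewards, dtype=float)
    violations = np.asarray(violations, dtype=float)
    return rewards - lam * violations


def update_multiplier(lam, violations, delta, lr):
    """Projected dual ascent on the multiplier.

    lam grows when the empirical violation rate exceeds the tolerance delta,
    shrinks otherwise, and is clipped so it never becomes negative.
    """
    violation_rate = float(np.mean(violations))
    return max(0.0, lam + lr * (violation_rate - delta))


# Toy batch: four steps, two of which violate the state constraint.
rewards = [1.0, 1.0, 1.0, 1.0]
violations = [0, 1, 0, 1]
lam = 0.5

shaped = penalized_rewards(rewards, violations, lam)
new_lam = update_multiplier(lam, violations, delta=0.1, lr=1.0)
```

In a full training loop, the shaped rewards would feed the usual PPO advantage estimate, and the multiplier update would run once per batch of rollouts.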
Taxonomy
Topics: Advanced Control Systems Optimization · Reinforcement Learning in Robotics · Model Reduction and Neural Networks
