Multistep Criticality Search and Power Shaping in Microreactors with Reinforcement Learning
Majdi I. Radaideh, Leo Tunkle, Dean Price, Kamal Abdulraheem, Linyu Lin, Moutaz Elias

TL;DR
This paper demonstrates the effectiveness of reinforcement learning algorithms, specifically PPO, for autonomous control of microreactors, achieving near-optimal power distribution and criticality maintenance in high-fidelity simulations.
Contribution
It introduces the application of advanced deep reinforcement learning techniques, PPO and A2C, for microreactor control, utilizing a surrogate model trained on detailed simulation data.
Findings
PPO achieved a power tilt ratio of approximately 1.002.
PPO maintained criticality within 10 pcm.
A2C was less effective than PPO in this control task.
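To make the two reported metrics concrete, the sketch below shows one common way to compute a power tilt ratio and a criticality deviation in pcm. The definitions are assumptions for illustration (tilt as maximum quadrant power over mean quadrant power, reactivity as (k - 1)/k with 1 pcm = 1e-5); the paper may define them differently. The function names and input values are hypothetical and are not taken from the paper.

```python
def power_tilt_ratio(quadrant_powers):
    """Assumed definition: maximum quadrant power divided by the mean
    quadrant power. A value of 1.0 means perfectly balanced power."""
    mean_power = sum(quadrant_powers) / len(quadrant_powers)
    return max(quadrant_powers) / mean_power


def reactivity_pcm(k_eff):
    """Assumed definition: reactivity (k - 1) / k, expressed in pcm
    (1 pcm = 1e-5). Zero means the core is exactly critical."""
    return (k_eff - 1.0) / k_eff * 1e5


# Hypothetical values chosen only to illustrate the scale of the metrics.
quadrants = [1.002, 0.999, 1.000, 0.999]  # relative quadrant powers
k_eff = 1.00008                           # effective multiplication factor

tilt = power_tilt_ratio(quadrants)
rho = reactivity_pcm(k_eff)
print(f"power tilt ratio: {tilt:.4f}")
print(f"reactivity: {rho:.2f} pcm")
```

Under these assumed definitions, a tilt of about 1.002 means the hottest quadrant runs roughly 0.2% above the average, and a deviation under 10 pcm means the multiplication factor stays within about 0.0001 of unity.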
Abstract
Reducing operation and maintenance costs is a key objective for advanced reactors in general and microreactors in particular. To achieve this reduction, developing robust autonomous control algorithms is essential to ensure safe and autonomous reactor operation. Recently, artificial intelligence and machine learning algorithms, specifically reinforcement learning (RL) algorithms, have seen rapidly increasing application to control problems, such as plasma control in fusion tokamaks and building energy management. In this work, we introduce the use of RL for intelligent control in nuclear microreactors. The RL agent is trained using proximal policy optimization (PPO) and advantage actor-critic (A2C), cutting-edge deep RL techniques, based on a high-fidelity simulation of a microreactor design inspired by the Westinghouse eVinci™ design. We utilized a Serpent model to…
Taxonomy
Topics: Metaheuristic Optimization Algorithms Research · Molecular Communication and Nanonetworks · Quantum Computing Algorithms and Architecture
Methods: A2C · Entropy Regularization · Proximal Policy Optimization
