Surpassing legacy approaches to PWR core reload optimization with single-objective Reinforcement learning
Paul Seurin, Koroush Shirvan

TL;DR
This paper demonstrates that Deep Reinforcement Learning, specifically Proximal Policy Optimization (PPO), outperforms traditional stochastic optimization methods in nuclear reactor core reload pattern optimization, a problem with multiple objectives and constraints that the paper treats in a single-objective formulation.
Contribution
The study introduces a DRL-based method using PPO for core reload optimization, showing its superiority over common stochastic algorithms in efficiency and solution quality.
Findings
PPO outperforms genetic algorithms, simulated annealing, and tabu search.
PPO effectively balances global and local search capabilities.
Statistical analysis confirms PPO's superiority in long-term optimization runs.
Abstract
Optimizing the fuel cycle cost through the optimization of nuclear reactor core loading patterns involves multiple objectives and constraints, leading to a vast number of candidate solutions that cannot be explicitly solved. To advance the state-of-the-art in core reload patterns, we have developed methods based on Deep Reinforcement Learning (DRL) for both single- and multi-objective optimization. Our previous research has laid the groundwork for these approaches and demonstrated their ability to discover high-quality patterns within a reasonable time frame. On the other hand, stochastic optimization (SO) approaches are commonly used in the literature, but there is no rigorous explanation that shows which approach is better in which scenario. In this paper, we demonstrate the advantage of our RL-based approach, specifically using Proximal Policy Optimization (PPO), against the most…
Taxonomy
Topics: Fault Detection and Control Systems · Nuclear reactor physics and engineering · Global Energy and Sustainability Research
Methods: Entropy Regularization · Proximal Policy Optimization · Focus
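
For readers unfamiliar with the two methods tagged above, here is a minimal sketch of how PPO's clipped surrogate objective and an entropy bonus are commonly combined, following the standard PPO formulation rather than the authors' implementation. The function name, argument names, and the default values of clip_eps and ent_coef are illustrative assumptions and are not taken from the paper. The value-function loss that usually accompanies this objective is omitted for brevity.

```python
import math


def ppo_clipped_objective(new_logp, old_logp, advantages, entropies,
                          clip_eps=0.2, ent_coef=0.01):
    """Average PPO clipped surrogate plus entropy bonus (to be maximized).

    All arguments are equal-length lists of floats, one entry per sampled
    action. clip_eps and ent_coef are illustrative defaults (assumptions),
    not values reported in the paper.
    """
    total = 0.0
    for lp_new, lp_old, adv, ent in zip(new_logp, old_logp, advantages, entropies):
        # Probability ratio between the updated policy and the policy that
        # collected the data.
        ratio = math.exp(lp_new - lp_old)
        # Clip the ratio so a single update cannot move the policy too far.
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Pessimistic (lower) bound of the unclipped and clipped terms.
        surrogate = min(ratio * adv, clipped * adv)
        # The entropy bonus rewards keeping the policy exploratory.
        total += surrogate + ent_coef * ent
    return total / len(advantages)
```

In this sketch the clipping term limits how far each update can move the policy, while the entropy coefficient controls how strongly exploration is encouraged. That trade-off is one plausible reading of the "balance between global and local search" noted in the findings.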
