Surpassing legacy approaches to PWR core reload optimization with single-objective Reinforcement learning

Paul Seurin; Koroush Shirvan

arXiv:2402.11040 · cs.NE · July 16, 2024 · 2 cites

Open Access

TL;DR

This paper shows that deep reinforcement learning, specifically Proximal Policy Optimization (PPO) in a single-objective setting, outperforms traditional stochastic optimization methods on pressurized water reactor (PWR) core reload pattern optimization, a problem that involves multiple objectives and constraints.

Contribution

The study introduces a DRL-based method using PPO for core reload optimization, showing its superiority over common stochastic algorithms in efficiency and solution quality.

Findings

01 PPO outperforms genetic algorithms, simulated annealing, and tabu search.

02 PPO effectively balances global and local search capabilities.

03 Statistical analysis confirms PPO's superiority in long-term optimization runs.

Abstract

Optimizing the fuel cycle cost through the optimization of nuclear reactor core loading patterns involves multiple objectives and constraints, leading to a vast number of candidate solutions that cannot be explicitly solved. To advance the state-of-the-art in core reload patterns, we have developed methods based on Deep Reinforcement Learning (DRL) for both single- and multi-objective optimization. Our previous research has laid the groundwork for these approaches and demonstrated their ability to discover high-quality patterns within a reasonable time frame. On the other hand, stochastic optimization (SO) approaches are commonly used in the literature, but there is no rigorous explanation that shows which approach is better in which scenario. In this paper, we demonstrate the advantage of our RL-based approach, specifically using Proximal Policy Optimization (PPO), against the most…
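For readers unfamiliar with PPO, the sketch below computes the standard clipped surrogate objective from the general PPO literature. It is not taken from this paper: the function name `ppo_clipped_surrogate`, its arguments, and the default `clip_eps=0.2` are illustrative assumptions, and the authors' actual implementation and hyperparameters may differ.

```python
import math


def ppo_clipped_surrogate(new_logps, old_logps, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective over a batch of sampled actions.

    Hypothetical helper for illustration only (not the paper's code).
    new_logps / old_logps: log-probabilities of the sampled actions under
    the updated policy and the data-collecting policy.
    advantages: advantage estimates for the same actions.
    """
    if not (len(new_logps) == len(old_logps) == len(advantages)):
        raise ValueError("all inputs must have the same length")
    if not advantages:
        raise ValueError("inputs must be non-empty")

    total = 0.0
    for new_lp, old_lp, adv in zip(new_logps, old_logps, advantages):
        # Probability ratio between the updated and the old policy.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Pessimistic bound: take the smaller of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

The clipping removes the incentive to move the policy far from the one that collected the data in a single update, which is the usual explanation for PPO's stable training; how that stability relates to the search behavior reported in this paper is discussed by the authors, not by this sketch.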

Taxonomy

Topics: Fault Detection and Control Systems · Nuclear reactor physics and engineering · Global Energy and Sustainability Research

Methods: Entropy Regularization · Proximal Policy Optimization · Focus