Stabilized Nested Rollout Policy Adaptation
Tristan Cazenave, Jean-Baptiste Sevestre, Matthieu Toulemont

TL;DR
This paper modifies the Nested Rollout Policy Adaptation (NRPA) algorithm to improve its stability and reports improved results on SameGame, Traveling Salesman with Time Windows, and Expression Discovery.
Contribution
The paper proposes a stability-enhanced variant of NRPA, a Monte Carlo search algorithm for single-player games, and evaluates it experimentally on three application domains.
Findings
Improved stability in NRPA across multiple domains
Enhanced performance in single-player game tasks
Improvements reported on SameGame, Traveling Salesman with Time Windows, and Expression Discovery
Abstract
Nested Rollout Policy Adaptation (NRPA) is a Monte Carlo search algorithm for single-player games. In this paper, we propose to modify NRPA in order to improve the stability of the algorithm. Experiments show it improves the algorithm for different application domains: SameGame, Traveling Salesman with Time Windows, and Expression Discovery.
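For readers unfamiliar with the baseline, the following is a minimal sketch of standard NRPA as it is commonly described, not the stabilized variant the paper proposes (this summary does not specify that modification). The toy problem (matching a hidden digit sequence), the names `TARGET`, `MOVES`, `score`, `playout`, `adapt`, and `nrpa`, and the values of `alpha` and `iterations` are all assumptions made for illustration.

```python
import math
import random

# Hypothetical toy problem: pick one move per position; the score is the
# number of positions that match a hidden target sequence.
TARGET = [2, 0, 1, 1, 2, 0, 2, 1]
MOVES = [0, 1, 2]


def score(seq):
    return sum(1 for a, b in zip(seq, TARGET) if a == b)


def playout(policy, rng):
    """Sample a full sequence with softmax probabilities over policy weights."""
    seq = []
    for pos in range(len(TARGET)):
        weights = [math.exp(policy.get((pos, m), 0.0)) for m in MOVES]
        seq.append(rng.choices(MOVES, weights=weights)[0])
    return score(seq), seq


def adapt(policy, seq, alpha=1.0):
    """Return a copy of the policy shifted toward the moves of `seq`."""
    new_policy = dict(policy)
    for pos, chosen in enumerate(seq):
        # Probabilities are computed from the old policy, as in the usual
        # gradient-style NRPA update.
        weights = [math.exp(policy.get((pos, m), 0.0)) for m in MOVES]
        z = sum(weights)
        for m, w in zip(MOVES, weights):
            new_policy[(pos, m)] = new_policy.get((pos, m), 0.0) - alpha * w / z
        new_policy[(pos, chosen)] += alpha
    return new_policy


def nrpa(level, policy, rng, iterations=20):
    """Nested search: level 0 is a playout; higher levels adapt a local policy."""
    if level == 0:
        return playout(policy, rng)
    best_score, best_seq = float("-inf"), None
    for _ in range(iterations):
        s, seq = nrpa(level - 1, policy, rng, iterations)
        if s >= best_score:
            best_score, best_seq = s, seq
        # adapt() returns a new dict, so the caller's policy is left untouched.
        policy = adapt(policy, best_seq)
    return best_score, best_seq


if __name__ == "__main__":
    best, seq = nrpa(2, {}, random.Random(0))
    print(best, seq)
```

In this sketch each level adapts its policy toward the best sequence found so far after every call to the level below, which is the behavior of the baseline algorithm whose stability the paper sets out to improve.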
