Stabilized Nested Rollout Policy Adaptation

Tristan Cazenave; Jean-Baptiste Sevestre; Matthieu Toulemont

arXiv:2101.03563·cs.AI·January 12, 2021

TL;DR

This paper modifies Nested Rollout Policy Adaptation (NRPA), a Monte Carlo search algorithm for single-player games, to improve its stability, and reports improved results on SameGame, Traveling Salesman with Time Windows, and Expression Discovery.

Contribution

The paper proposes a variant of NRPA that is modified to improve the stability of the algorithm.

Findings

01

Improved stability in NRPA across multiple domains

02

Enhanced performance in single-player game tasks

03

Improvements reported on SameGame, Traveling Salesman with Time Windows, and Expression Discovery

Abstract

Nested Rollout Policy Adaptation (NRPA) is a Monte Carlo search algorithm for single-player games. In this paper we propose to modify NRPA in order to improve the stability of the algorithm. Experiments show that the modification improves the algorithm in several application domains: SameGame, Traveling Salesman with Time Windows, and Expression Discovery.
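
For readers unfamiliar with the baseline, the following is a minimal sketch of standard NRPA as it is commonly described, not the stabilized variant proposed in the paper, whose details are not given in the abstract. The toy scoring function, the (step, move) policy keys, and all names and parameter values are assumptions made for illustration only.

```python
import math
import random

def playout(policy, n_steps, n_moves, score_fn, rng):
    """Sample one move sequence using softmax sampling over policy weights."""
    seq = []
    for step in range(n_steps):
        # Assumption: a move is coded by (step, move); real domains use
        # domain-specific move codes.
        weights = [math.exp(policy.get((step, m), 0.0)) for m in range(n_moves)]
        seq.append(rng.choices(range(n_moves), weights=weights)[0])
    return score_fn(seq), seq

def adapt(policy, seq, n_moves, alpha=1.0):
    """Return a copy of the policy shifted toward the given sequence."""
    new_policy = dict(policy)
    for step, chosen in enumerate(seq):
        # Probabilities are computed from the old policy, as in a gradient step.
        z = sum(math.exp(policy.get((step, m), 0.0)) for m in range(n_moves))
        for m in range(n_moves):
            p = math.exp(policy.get((step, m), 0.0)) / z
            new_policy[(step, m)] = new_policy.get((step, m), 0.0) - alpha * p
        new_policy[(step, chosen)] = new_policy.get((step, chosen), 0.0) + alpha
    return new_policy

def nrpa(level, policy, n_iter, n_steps, n_moves, score_fn, rng):
    """Nested search: each level adapts its own policy copy toward its best sequence."""
    if level == 0:
        return playout(policy, n_steps, n_moves, score_fn, rng)
    best_score, best_seq = -math.inf, []
    policy = dict(policy)  # each level works on a private copy
    for _ in range(n_iter):
        score, seq = nrpa(level - 1, policy, n_iter, n_steps, n_moves, score_fn, rng)
        if score >= best_score:
            best_score, best_seq = score, seq
        policy = adapt(policy, best_seq, n_moves)
    return best_score, best_seq

# Hypothetical toy problem: score counts positions that match a hidden target.
TARGET = [2, 0, 1, 1, 2]

def toy_score(seq):
    return sum(1 for a, b in zip(seq, TARGET) if a == b)

if __name__ == "__main__":
    rng = random.Random(0)
    score, seq = nrpa(2, {}, 10, len(TARGET), 3, toy_score, rng)
    print(score, seq)
```

In this sketch the lowest level adapts the policy after every single playout, which is the step whose stability the paper sets out to improve.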
