Generalized Nested Rollout Policy Adaptation with Limited Repetitions
Tristan Cazenave

TL;DR
This paper improves GNRPA, an existing Monte Carlo search algorithm, by limiting how many times the best sequence may be repeated at a given level, which enhances exploration and performance on three combinatorial problems.
Contribution
The paper proposes a novel modification to GNRPA that prevents overly deterministic policies by limiting repetitions, leading to better results in multiple combinatorial tasks.
Findings
Improved performance on Inverse RNA Folding
Enhanced solutions for Traveling Salesman Problem with Time Windows
Better results on the Weak Schur problem
Abstract
Generalized Nested Rollout Policy Adaptation (GNRPA) is a Monte Carlo search algorithm for optimizing a sequence of choices. We propose to improve GNRPA by avoiding overly deterministic policies that find the same sequence of choices again and again. We do so by limiting the number of repetitions of the best sequence found at a given level. Experiments show that this improves the algorithm on three different combinatorial problems: Inverse RNA Folding, the Traveling Salesman Problem with Time Windows, and the Weak Schur problem.
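The repetition limit described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: it uses plain NRPA without the generalized bias term, a toy objective, and hypothetical names (`nrpa_limited`, `max_reps`). It also assumes that a level stops once the lower level has returned the current best sequence `max_reps` times, and that the counter resets when a new best sequence is adopted; the summary above does not specify either detail.

```python
import math
import random

def score(seq):
    # Toy objective (assumption): reward positions where move == index % 3.
    return sum(1 for i, m in enumerate(seq) if m == i % 3)

def playout(policy, n_steps, n_moves, rng):
    # Sample each move with probability proportional to exp(weight).
    seq = []
    for step in range(n_steps):
        weights = [math.exp(policy.get((step, m), 0.0)) for m in range(n_moves)]
        seq.append(rng.choices(range(n_moves), weights=weights)[0])
    return seq

def adapt(policy, seq, n_moves, alpha=1.0):
    # Shift the policy toward seq; returns a new dict, leaving policy unchanged.
    new = dict(policy)
    for step, best in enumerate(seq):
        weights = [math.exp(policy.get((step, m), 0.0)) for m in range(n_moves)]
        z = sum(weights)
        for m in range(n_moves):
            target = 1.0 if m == best else 0.0
            new[(step, m)] = new.get((step, m), 0.0) + alpha * (target - weights[m] / z)
    return new

def nrpa_limited(level, policy, n_iter, max_reps, n_steps, n_moves, rng):
    if level == 0:
        seq = playout(policy, n_steps, n_moves, rng)
        return score(seq), seq
    best_score, best_seq = -math.inf, None
    reps = 0
    for _ in range(n_iter):
        s, seq = nrpa_limited(level - 1, policy, n_iter, max_reps,
                              n_steps, n_moves, rng)
        if seq == best_seq:
            # The lower level found the same best sequence again.
            reps += 1
            if reps >= max_reps:
                break  # assumed stopping rule: stop iterating at this level
        elif s >= best_score:
            best_score, best_seq = s, seq
            reps = 0  # assumption: reset the counter on a new best sequence
        policy = adapt(policy, best_seq, n_moves)
    return best_score, best_seq

if __name__ == "__main__":
    rng = random.Random(0)
    print(nrpa_limited(2, {}, n_iter=20, max_reps=3, n_steps=6, n_moves=3, rng=rng))
```

With a small `max_reps`, a level that has converged on one sequence returns early, so the remaining search budget can be spent by the level above on a differently adapted policy rather than on rediscovering the same sequence.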
Taxonomy
Topics: Optimization and Search Problems · Machine Learning and Algorithms · Advanced Multi-Objective Optimization Algorithms
