Planning and Learning Using Adaptive Entropy Tree Search

Piotr Kozakowski; Mikołaj Pacek; Piotr Miłoś

arXiv:2102.06808·cs.AI·March 16, 2023

TL;DR

The paper introduces Adaptive Entropy Tree Search (ANTS), a planning and learning algorithm that outperforms PUCT, the planning component of AlphaZero, on the Atari benchmark and is more robust to hyperparameter choices than previous algorithms.

Contribution

ANTS is a novel algorithm that combines planning and learning in the maximum entropy paradigm. It resolves the failure of earlier maximum entropy planning methods when combined with learning and reaches state-of-the-art performance.

Findings

01. ANTS outperforms PUCT in Atari benchmarks

02. ANTS shows robustness to hyperparameter variations

03. ANTS reaches state-of-the-art performance in planning and learning

Abstract

Recent breakthroughs in Artificial Intelligence have shown that the combination of tree-based planning with deep learning can lead to superior performance. We present Adaptive Entropy Tree Search (ANTS), a novel algorithm combining planning and learning in the maximum entropy paradigm. Through a comprehensive suite of experiments on the Atari benchmark we show that ANTS significantly outperforms PUCT, the planning component of the state-of-the-art AlphaZero system. ANTS builds upon recent work on maximum entropy planning methods, which, as we show, fail when combined with learning. ANTS resolves this issue to reach state-of-the-art performance. We further find that ANTS exhibits superior robustness to different hyperparameter choices compared to previous algorithms. We believe that the high performance and robustness of ANTS can bring tree search planning one step closer…
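The abstract contrasts PUCT with maximum entropy planning without giving formulas. As a rough illustration of those two building blocks only (this is not the ANTS algorithm, whose details are not given on this page), the sketch below shows the AlphaZero-style PUCT score and the softmax policy with its log-sum-exp soft value, as commonly used in maximum entropy methods. All function names, argument names, and the default `c_puct` are assumptions made for illustration.

```python
import math


def puct_scores(q, prior, visits, c_puct=1.0):
    """AlphaZero-style PUCT score per action at a single node.

    score(a) = Q(a) + c_puct * P(a) * sqrt(sum_b N(b)) / (1 + N(a))
    The default c_puct is an illustrative assumption.
    """
    total = sum(visits)
    return [
        q_a + c_puct * p_a * math.sqrt(total) / (1 + n_a)
        for q_a, p_a, n_a in zip(q, prior, visits)
    ]


def softmax_policy(q, temperature):
    """Boltzmann policy: pi(a) proportional to exp(Q(a) / temperature)."""
    m = max(q)  # subtract the max for numerical stability
    weights = [math.exp((q_a - m) / temperature) for q_a in q]
    z = sum(weights)
    return [w / z for w in weights]


def soft_value(q, temperature):
    """Soft value: temperature * log(sum_a exp(Q(a) / temperature))."""
    m = max(q)
    return m + temperature * math.log(
        sum(math.exp((q_a - m) / temperature) for q_a in q)
    )


def entropy(policy):
    """Shannon entropy (in nats) of a discrete policy."""
    return -sum(p * math.log(p) for p in policy if p > 0)


if __name__ == "__main__":
    q = [1.0, 0.5, 0.0]
    print(puct_scores(q, prior=[0.5, 0.3, 0.2], visits=[10, 5, 1]))
    for temperature in (0.1, 1.0, 10.0):
        pi = softmax_policy(q, temperature)
        print(temperature, entropy(pi), soft_value(q, temperature))
```

In this sketch the temperature controls how close the softmax policy is to greedy: a small temperature concentrates probability on the best action and pushes the soft value toward `max(q)`, while a large temperature moves the policy toward uniform and raises its entropy.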

Code & Models

Repositories

adaptive-entropy-tree-search/ants (official)

Taxonomy

Topics: Artificial Intelligence in Games · Metaheuristic Optimization Algorithms Research · Reinforcement Learning in Robotics

Methods: Softmax · AlphaZero