Width-based Lookaheads with Learnt Base Policies and Heuristics Over the Atari-2600 Benchmark

Stefan O'Toole; Nir Lipovetzky; Miquel Ramirez; Adrian Pearce

arXiv:2106.12151·cs.AI·October 29, 2021


TL;DR

This paper introduces new width-based planning and learning algorithms. The best of them, N-CPL, outperforms $π$-IW(1), $π$-IW(1)+ and $π$-HIW(n, 1) on the Atari-2600 games, particularly on games with large branching factors and on games with sparse meaningful rewards.

Contribution

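The paper presents new width-based planning and learning algorithms motivated by an analysis of the design decisions made in previous width-based planners, with N-CPL the best performing, and provides a taxonomy of the Atari-2600 games according to some of their defining characteristics.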

Findings

01

N-CPL outperforms the earlier width-based planning and learning algorithms $π$-IW(1), $π$-IW(1)+ and $π$-HIW(n, 1) on the Atari-2600 games.

02

N-CPL's advantage over these baselines holds in particular on games with large branching factors and on games with sparse meaningful rewards.

03

A taxonomy of the Atari-2600 games by their defining characteristics gives further insight into the behaviour and performance of the algorithms.

Abstract

We propose new width-based planning and learning algorithms inspired by a careful analysis of the design decisions made by previous width-based planners. The algorithms are applied to the Atari-2600 games, and our best-performing algorithm, Novelty guided Critical Path Learning (N-CPL), outperforms the previously introduced width-based planning and learning algorithms $π$-IW(1), $π$-IW(1)+ and $π$-HIW(n, 1). Furthermore, we present a taxonomy of the Atari-2600 games according to some of their defining characteristics. This analysis of the games provides further insight into the behaviour and performance of the algorithms introduced. Namely, for games with large branching factors, and games with sparse meaningful rewards, N-CPL outperforms $π$-IW, $π$-IW(1)+ and $π$-HIW(n, 1).
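For readers new to width-based search, the sketch below illustrates the novelty test behind plain IW(1), the basic procedure that the $π$-IW family extends with learnt policies. It is background only and not an implementation of N-CPL or of any algorithm in the paper. The function names (`iw1`, `grid_successors`, `grid_features`) and the toy 3x3 grid domain are assumptions made for illustration. A state is kept only if it makes at least one feature value true for the first time in the search.

```python
from collections import deque


def iw1(initial_state, successors, features, max_nodes=10_000):
    """Breadth-first search with width-1 novelty pruning.

    A generated state is kept only if at least one of its feature atoms
    has not been seen in any previously generated state.
    """
    seen_atoms = set(features(initial_state))
    queue = deque([initial_state])
    expanded = []
    while queue and len(expanded) < max_nodes:
        state = queue.popleft()
        expanded.append(state)
        for child in successors(state):
            new_atoms = set(features(child)) - seen_atoms
            if not new_atoms:
                continue  # no novel atom: prune this state
            seen_atoms |= new_atoms
            queue.append(child)
    return expanded


# Hypothetical toy domain: an agent moving on a 3x3 grid.
def grid_successors(state):
    x, y = state
    moves = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    return [(nx, ny) for nx, ny in moves if 0 <= nx < 3 and 0 <= ny < 3]


def grid_features(state):
    # Each coordinate value is one atom, so there are six atoms in total.
    x, y = state
    return [("x", x), ("y", y)]


visited = iw1((0, 0), grid_successors, grid_features)
print(visited)
```

On this toy grid the search expands 5 of the 9 cells, because a cell such as (1, 1) introduces no coordinate value that has not already been seen. The number of states kept is bounded by the number of distinct atoms rather than by the size of the state space, which is what keeps lookaheads of this kind cheap when the branching factor is large.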


Videos

Width-based Lookaheads with Learnt Base Policies and Heuristics Over the Atari-2600 Benchmark · SlidesLive

Taxonomy

Topics: Artificial Intelligence in Games · Evolutionary Algorithms and Applications · Reinforcement Learning in Robotics