Is Policy Learning Overrated?: Width-Based Planning and Active Learning for Atari

Benjamin Ayton; Masataro Asai

arXiv:2109.15310·cs.AI·March 22, 2022

TL;DR

This paper introduces Olive, an online active learning method for width-based planning in Atari games, which updates feature representations during planning to improve performance without policy learning.

Contribution

Olive is the first approach to update VAE features online using active learning during planning, significantly improving Atari game performance without policy training.

Findings

01

Across 55 Atari games, Olive outperforms Rollout-IW by 42-to-11 and VAE-IW by 32-to-20.

02

Olive outperforms the policy-learning methods $π$-IW and DQN by 30-to-22 and 31-to-17, even though those methods were trained with a 100x training budget.

03

Olive achieves state-of-the-art data efficiency on the Atari 100k benchmark.

Abstract

Width-based planning has shown promising results on Atari 2600 games using pixel input, while using substantially fewer environment interactions than reinforcement learning. Recent width-based approaches compute a feature vector for each screen, using either a hand-designed feature set or a variational autoencoder trained on game screens (VAE-IW), and prune screens that do not have novel features during the search. We propose Olive (Online-VAE-IW), which updates the VAE features online using active learning to maximize the utility of screens observed during planning. Experimental results in 55 Atari games demonstrate that it outperforms Rollout-IW by 42-to-11 and VAE-IW by 32-to-20. Moreover, Olive outperforms existing work based on policy learning ($π$-IW, DQN) trained with a 100x training budget by 30-to-22 and 31-to-17, and a state-of-the-art data-efficient reinforcement learning…
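
The pruning rule mentioned in the abstract (discard screens that have no novel features) can be illustrated with a minimal sketch. This is a simplified width-1 novelty check, not the authors' implementation: the names `NoveltyTable` and `prune` are hypothetical, and each screen is assumed to be already encoded as a tuple of binary features (for example, by a hand-designed feature set or a VAE encoder).

```python
from typing import Iterable, List, Set, Tuple

Features = Tuple[int, ...]


class NoveltyTable:
    """Hypothetical helper: remembers every (index, value) feature atom seen so far."""

    def __init__(self) -> None:
        self.seen: Set[Tuple[int, int]] = set()

    def is_novel(self, features: Iterable[int]) -> bool:
        # A screen is novel if at least one of its (index, value) atoms
        # has not appeared in any previously checked screen.
        atoms = {(i, v) for i, v in enumerate(features)}
        new_atoms = atoms - self.seen
        self.seen |= atoms
        return bool(new_atoms)


def prune(screens: List[Features]) -> List[Features]:
    """Keep only the screens that contribute a new feature atom, in visit order."""
    table = NoveltyTable()
    return [f for f in screens if table.is_novel(f)]


if __name__ == "__main__":
    visited = [(0, 0, 1), (0, 0, 1), (1, 0, 1), (0, 0, 1)]
    print(prune(visited))
```

In this sketch the second and fourth screens are pruned because every feature value they carry has already been seen. The quality of the search therefore depends heavily on the feature encoding, which is the part Olive updates online according to the abstract.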

Code & Models

Repositories

ibm/atari-active-learning (PyTorch, official)

Taxonomy

Topics: Reinforcement Learning in Robotics · Artificial Intelligence in Games