Thinker: Learning to Plan and Act

Stephen Chung; Ivan Anokhin; David Krueger

arXiv:2307.14993 · cs.AI · October 30, 2023

TL;DR

Thinker is a reinforcement learning algorithm that lets agents learn to plan autonomously with a learned world model, achieving state-of-the-art performance in Sokoban and competitive results on the Atari 2600 benchmark.

Contribution

The paper introduces a planning approach in which RL agents learn to interact with a learned world model through dedicated model-interaction actions, eliminating the need for handcrafted planning algorithms.

Findings

01 Achieves state-of-the-art performance in Sokoban

02 Achieves competitive results on the Atari 2600 benchmark

03 The agent's plans can be easily interpreted through visualization

Abstract

We propose the Thinker algorithm, a novel approach that enables reinforcement learning agents to autonomously interact with and utilize a learned world model. The Thinker algorithm wraps the environment with a world model and introduces new actions designed for interacting with the world model. These model-interaction actions enable agents to perform planning by proposing alternative plans to the world model before selecting a final action to execute in the environment. This approach eliminates the need for handcrafted planning algorithms by enabling the agent to learn how to plan autonomously and allows for easy interpretation of the agent's plan with visualization. We demonstrate the algorithm's effectiveness through experimental results in the game of Sokoban and the Atari 2600 benchmark, where the Thinker algorithm achieves state-of-the-art performance and competitive results,…
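The wrapping idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy `ChainEnv`, the `make_chain_model` stand-in for a learned world model, the `ModelWrappedEnv` class, the fixed budget `k`, and the `reset_rollout` flag are all hypothetical names and simplifications chosen for illustration. The sketch assumes that each real environment step is preceded by `k - 1` imagined steps taken inside the model, and that the agent may send the imagined rollout back to the current real state.

```python
from typing import Callable, Dict, Tuple

# A model maps (state, action) to (predicted next state, predicted reward).
Model = Callable[[int, int], Tuple[int, float]]


class ChainEnv:
    """Toy environment (hypothetical): walk along states 0..n to reach n."""

    def __init__(self, n: int = 5) -> None:
        self.n = n
        self.state = 0

    def reset(self) -> int:
        self.state = 0
        return self.state

    def step(self, action: int) -> Tuple[int, float, bool]:
        # Action 1 moves right, any other action moves left; clip to [0, n].
        delta = 1 if action == 1 else -1
        self.state = max(0, min(self.n, self.state + delta))
        done = self.state == self.n
        return self.state, (1.0 if done else 0.0), done


def make_chain_model(n: int) -> Model:
    """Stand-in for a learned world model; here it copies the true dynamics."""

    def model(state: int, action: int) -> Tuple[int, float]:
        nxt = max(0, min(n, state + (1 if action == 1 else -1)))
        return nxt, (1.0 if nxt == n else 0.0)

    return model


class ModelWrappedEnv:
    """Assumed scheme: k - 1 imagined steps in the model, then one real step."""

    def __init__(self, env: ChainEnv, model: Model, k: int = 4) -> None:
        if k < 1:
            raise ValueError("k must be at least 1")
        self.env, self.model, self.k = env, model, k
        self.real_state = 0
        self.imag_state = 0
        self.imag_reward = 0.0
        self.t = 0  # position inside the current planning stage

    def _obs(self) -> Dict[str, object]:
        return {
            "real_state": self.real_state,
            "imagined_state": self.imag_state,
            "imagined_reward": self.imag_reward,
            "next_step_is_real": self.t == self.k - 1,
        }

    def reset(self) -> Dict[str, object]:
        self.real_state = self.env.reset()
        self.imag_state = self.real_state
        self.imag_reward = 0.0
        self.t = 0
        return self._obs()

    def step(
        self, action: int, reset_rollout: bool = False
    ) -> Tuple[Dict[str, object], float, bool]:
        if self.t < self.k - 1:
            # Imagined step: query the model only; the real env is untouched.
            nxt, self.imag_reward = self.model(self.imag_state, action)
            # Optionally restart the imagined rollout from the real state.
            self.imag_state = self.real_state if reset_rollout else nxt
            self.t += 1
            return self._obs(), 0.0, False
        # Real step: execute the chosen action in the actual environment.
        self.real_state, reward, done = self.env.step(action)
        self.imag_state, self.imag_reward, self.t = self.real_state, 0.0, 0
        return self._obs(), reward, done
```

An ordinary RL agent trained on `ModelWrappedEnv` would see the imagined state and predicted reward in its observation, so in principle it can learn which rollouts to try before committing to a real action. Exposing those fields in the observation is also what makes the imagined rollout easy to inspect.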

Code & Models

Repositories

stephen-chung-mh/thinker · PyTorch · Official

Videos

Thinker: Learning to Plan and Act · SlidesLive

Taxonomy

Topics: Artificial Intelligence in Games · Reinforcement Learning in Robotics · Multi-Agent Systems and Negotiation