World Model as a Graph: Learning Latent Landmarks for Planning

Lunjun Zhang; Ge Yang; Bradly C. Stadie

arXiv:2011.12491 · cs.AI · July 2, 2021 · 22 cites

TL;DR

This paper introduces L3P (Learning Latent Landmarks for Planning), a graph-structured world model built on latent landmarks for long-horizon planning in complex environments. It combines the advantages of model-based and model-free RL.

Contribution

It proposes a new method to learn graph-based world models with latent landmarks and reachability estimates, enhancing planning capabilities in high-dimensional tasks.

Findings

01. L3P outperforms prior methods on various control tasks.

02. L3P combines the robustness of model-free RL with the generalization of graph search.

03. L3P enables scalable long-horizon planning in complex environments.

Abstract

Planning - the ability to analyze the structure of a problem in the large and decompose it into interrelated subproblems - is a hallmark of human intelligence. While deep reinforcement learning (RL) has shown great promise for solving relatively straightforward control tasks, it remains an open problem how to best incorporate planning into existing deep RL paradigms to handle increasingly complex environments. One prominent framework, Model-Based RL, learns a world model and plans using step-by-step virtual rollouts. This type of world model quickly diverges from reality when the planning horizon increases, thus struggling at long-horizon planning. How can we learn world models that endow agents with the ability to do temporally extended reasoning? In this work, we propose to learn graph-structured world models composed of sparse, multi-step transitions. We devise a novel algorithm to…
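To make the idea of planning over sparse, multi-step transitions concrete, here is a minimal sketch of shortest-path search over a small landmark graph. It is not the authors' implementation. The function name `plan_over_landmarks`, the node labels, and the edge costs are all hypothetical, and the costs merely stand in for learned reachability estimates between landmarks. Dijkstra's algorithm is used here only as a generic graph-search routine.

```python
import heapq

# Hypothetical landmark graph: node -> list of (neighbor, cost).
# Each cost stands in for an estimated number of steps between landmarks.
EXAMPLE_EDGES = {
    "s": [("a", 4.0), ("b", 10.0)],
    "a": [("b", 3.0), ("g", 12.0)],
    "b": [("g", 5.0)],
}


def plan_over_landmarks(edges, start, goal):
    """Return (path, total_cost) from start to goal using Dijkstra's algorithm.

    Returns (None, inf) when the goal cannot be reached.
    """
    dist = {start: 0.0}   # best known cost to each node
    prev = {}             # back-pointers for path reconstruction
    heap = [(0.0, start)]
    visited = set()

    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node == goal:
            break
        for neighbor, cost in edges.get(node, []):
            new_d = d + cost
            if new_d < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_d
                prev[neighbor] = node
                heapq.heappush(heap, (new_d, neighbor))

    if goal not in dist:
        return None, float("inf")

    # Walk the back-pointers from goal to start, then reverse.
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    path.reverse()
    return path, dist[goal]


path, cost = plan_over_landmarks(EXAMPLE_EDGES, "s", "g")
```

In this toy graph the search prefers the three-hop route through `a` and `b` over the direct but more expensive edges. An agent could then pursue each landmark on the returned path in turn as an intermediate subgoal instead of rolling a model forward one step at a time.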

Code & Models

Repositories

LunjunZhang/world-model-as-a-graph · PyTorch · Official

Videos

World Model as a Graph: Learning Latent Landmarks for Planning · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics · AI-based Problem Solving and Planning · Artificial Intelligence in Games