World Value Functions: Knowledge Representation for Multitask Reinforcement Learning

Geraud Nangue Tasse; Steven James; Benjamin Rosman

arXiv:2205.08827·cs.LG·May 19, 2022

TL;DR

This paper introduces world value functions (WVFs), a novel knowledge representation enabling agents to learn, infer, and reuse solutions for multiple tasks in a given environment, supporting zero-shot generalization and planning.

Contribution

The work proposes WVFs as a unified framework for representing and learning knowledge that generalizes across tasks, allowing inference of policies and values for new tasks and enabling zero-shot skill composition.

Findings

01. When the internal goal space is the entire state space, the transition function can be inferred from a learned WVF.

02. Pretrained WVFs enable direct inference of policies for new tasks.

03. Agents can solve combined tasks zero-shot using WVFs.
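
The zero-shot composition finding can be illustrated with a toy sketch. This summary does not state which composition operators are used, so the sketch assumes the elementwise max (for "or") and min (for "and") from the Boolean task algebra line of work. The one-step world, the goal names, the task rewards, and the constant `R_MISS` are all hypothetical and chosen only for illustration.

```python
import numpy as np

GOALS = ("A", "B", "C")   # hypothetical terminal outcomes; action i leads to GOALS[i]
R_MISS = -10.0            # assumed penalty for ending in a goal other than the targeted one


def one_step_wvf(task_reward):
    """Closed-form WVF for a single start state.

    q[g, a] is the value of taking action a while aiming for goal g.
    Only the action that actually reaches g earns the task reward.
    """
    q = np.full((len(GOALS), len(GOALS)), R_MISS)
    for i, g in enumerate(GOALS):
        q[i, i] = task_reward[g]
    return q


def best_goal(q):
    """Return the goal whose best action has the highest value."""
    g_idx, _ = np.unravel_index(q.argmax(), q.shape)
    return GOALS[g_idx]


# Two hypothetical base tasks defined over the same goals.
task1 = one_step_wvf({"A": 1.0, "B": 1.0, "C": 0.0})
task2 = one_step_wvf({"A": 0.0, "B": 1.0, "C": 1.0})

# Assumed composition: no further learning, only elementwise operations.
q_and = np.minimum(task1, task2)   # goals good in both tasks
q_or = np.maximum(task1, task2)    # goals good in at least one task

print("task1 AND task2 ->", best_goal(q_and))
```

Because both base WVFs store a value for every goal rather than only the best one, the composed table still knows which goals satisfy the combined task, which is what makes the composition possible without retraining.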

Abstract

An open problem in artificial intelligence is how to learn and represent knowledge that is sufficient for a general agent that needs to solve multiple tasks in a given world. In this work we propose world value functions (WVFs), which are a type of general value function with mastery of the world: they represent not only how to solve a given task, but also how to solve any other goal-reaching task. To achieve this, we equip the agent with an internal goal space defined as all the world states where it experiences a terminal transition (a task outcome). The agent can then modify task rewards to define its own reward function, which provably drives it to learn how to achieve all achievable internal goals, and the value of doing so in the current task. We demonstrate a number of benefits of WVFs. When the agent's internal goal space is the entire state space, we demonstrate that the…
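
The idea of a goal-conditioned value function with a modified reward can be sketched on a tiny chain world. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact reward modification, so the sketch assumes a fixed penalty (called `R_MISS` here) whenever the agent terminates in a goal other than the one it is targeting. The chain layout, reward values, and function names are hypothetical.

```python
import numpy as np

N_STATES = 5                     # chain of states 0..4
ACTIONS = (-1, +1)               # move left / move right
GOALS = (0, 4)                   # internal goals: the two terminal ends of the chain
R_MISS = -10.0                   # assumed penalty for ending in a goal other than g
STEP_REWARD = -0.1               # assumed task reward for a non-terminal step
GOAL_REWARD = {0: 0.0, 4: 1.0}   # assumed task reward on reaching each terminal state
GAMMA = 1.0


def step(s, a):
    """Deterministic transition, clipped to the chain."""
    return min(max(s + a, 0), N_STATES - 1)


def extended_reward(s_next, g):
    """Task reward, replaced by R_MISS when terminating in the wrong goal."""
    if s_next in GOALS:
        return GOAL_REWARD[s_next] if s_next == g else R_MISS
    return STEP_REWARD


def learn_wvf(sweeps=50):
    """Tabular value iteration over (state, goal, action) triples."""
    q = np.zeros((N_STATES, len(GOALS), len(ACTIONS)))
    for _ in range(sweeps):
        for s in range(N_STATES):
            if s in GOALS:
                continue  # terminal states have no outgoing actions
            for gi, g in enumerate(GOALS):
                for ai, a in enumerate(ACTIONS):
                    s_next = step(s, a)
                    future = 0.0 if s_next in GOALS else GAMMA * q[s_next, gi].max()
                    q[s, gi, ai] = extended_reward(s_next, g) + future
    return q


def greedy_action(q, s):
    """Act for the current task: maximise over goals first, then over actions."""
    return ACTIONS[q[s].max(axis=0).argmax()]


q = learn_wvf()
```

In this sketch `greedy_action(q, s)` heads for the goal that is best under the task reward, while `q[s, gi].argmax()` gives the action for reaching any chosen goal `GOALS[gi]`, including one the task itself does not favour.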

Taxonomy

Topics: Reinforcement Learning in Robotics · Explainable Artificial Intelligence (XAI)