Finding Useful Predictions by Meta-gradient Descent to Improve Decision-making

Alex Kearney; Anna Koop; Johannes Günther; Patrick M. Pilarski

arXiv:2111.11212 · cs.LG · November 23, 2021

TL;DR

This paper introduces a meta-gradient descent method allowing reinforcement learning agents to autonomously select and learn useful predictions, specifically General Value Functions, to improve decision-making in partially observable environments.

Contribution

The work presents a novel meta-gradient approach enabling agents to independently identify and learn valuable predictions without manual specification, enhancing autonomous decision-making.

Findings

01. Agents can independently select predictions that resolve partial observability.

02. The method achieves performance comparable to expert-designed value functions.

03. Self-supervised prediction learning improves decision-making in the partially observable domain introduced in the paper.

Abstract

In computational reinforcement learning, a growing body of work seeks to express an agent's model of the world through predictions about future sensations. In this manuscript we focus on predictions expressed as General Value Functions: temporally extended estimates of the accumulation of a future signal. One challenge is determining from the infinitely many predictions that the agent could possibly make which might support decision-making. In this work, we contribute a meta-gradient descent method by which an agent can directly specify what predictions it learns, independent of designer instruction. To that end, we introduce a partially observable domain suited to this investigation. We then demonstrate that through interaction with the environment an agent can independently select predictions that resolve the partial-observability, resulting in performance similar to expertly chosen…
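To make the idea of a General Value Function concrete, the following is a minimal sketch, not the paper's domain or its meta-gradient method. It assumes a hypothetical five-state ring world, a fixed policy that always steps forward, a cumulant of 1 on arrival at state 0, a constant discount, and tabular TD(0) learning. The names `learn_gvf` and `true_gvf` are illustrative only.

```python
import numpy as np

def learn_gvf(n_states=5, gamma=0.9, step_size=0.1, n_steps=20000):
    """TD(0) estimate of one GVF on a hypothetical deterministic ring world."""
    w = np.zeros(n_states)  # tabular features: one weight per state
    s = 0
    for _ in range(n_steps):
        s_next = (s + 1) % n_states             # assumed fixed policy: step forward
        cumulant = 1.0 if s_next == 0 else 0.0  # assumed signal: arrival at state 0
        td_error = cumulant + gamma * w[s_next] - w[s]
        w[s] += step_size * td_error            # move the estimate toward the TD target
        s = s_next
    return w

def true_gvf(n_states=5, gamma=0.9):
    """Closed-form discounted sum of the cumulant for the same ring world."""
    steps_to_signal = np.array([n_states - s for s in range(n_states)])
    # The signal arrives after steps_to_signal transitions, then every n_states after that.
    return gamma ** (steps_to_signal - 1) / (1.0 - gamma ** n_states)

if __name__ == "__main__":
    print(np.round(learn_gvf(), 4))
    print(np.round(true_gvf(), 4))
```

In this sketch the prediction is fixed by three hand-picked choices: the cumulant, the discount, and the policy. The abstract's contribution concerns letting the agent choose such specifications itself by meta-gradient descent rather than having a designer supply them.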

Taxonomy

Topics: Reinforcement Learning in Robotics · Explainable Artificial Intelligence (XAI) · Neural dynamics and brain function