Finding Useful Predictions by Meta-gradient Descent to Improve Decision-making
Alex Kearney, Anna Koop, Johannes Günther, Patrick M. Pilarski

TL;DR
This paper introduces a meta-gradient descent method allowing reinforcement learning agents to autonomously select and learn useful predictions, specifically General Value Functions, to improve decision-making in partially observable environments.
Contribution
The work presents a novel meta-gradient approach enabling agents to independently identify and learn valuable predictions without manual specification, enhancing autonomous decision-making.
Findings
Agents can independently select predictions that resolve partial observability.
The method achieves performance comparable to expert-designed value functions.
Self-supervised prediction learning improves decision-making in complex environments.
Abstract
In computational reinforcement learning, a growing body of work seeks to express an agent's model of the world through predictions about future sensations. In this manuscript we focus on predictions expressed as General Value Functions: temporally extended estimates of the accumulation of a future signal. One challenge is determining from the infinitely many predictions that the agent could possibly make which might support decision-making. In this work, we contribute a meta-gradient descent method by which an agent can directly specify what predictions it learns, independent of designer instruction. To that end, we introduce a partially observable domain suited to this investigation. We then demonstrate that through interaction with the environment an agent can independently select predictions that resolve the partial-observability, resulting in performance similar to expertly chosen…
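To make the abstract's definition of a General Value Function concrete, here is a minimal sketch of learning one prediction with tabular TD(0). It is not the paper's meta-gradient method and does not use the paper's domain: the ring environment, the cumulant (a signal of 1 on entering state 0), the constant discount, and the names `learn_gvf_td0` and `true_gvf` are all assumptions made for illustration.

```python
import numpy as np


def learn_gvf_td0(n_states=5, gamma=0.9, step_size=0.1, n_steps=20000):
    """Learn one GVF with tabular TD(0) on a hypothetical deterministic ring.

    The agent moves s -> (s + 1) % n_states at every step. The cumulant
    (the signal being accumulated) is 1.0 on entering state 0, else 0.0.
    """
    w = np.zeros(n_states)  # one estimate per state
    s = 0
    for _ in range(n_steps):
        s_next = (s + 1) % n_states
        cumulant = 1.0 if s_next == 0 else 0.0
        # TD error: observed signal plus discounted next estimate,
        # minus the current estimate.
        td_error = cumulant + gamma * w[s_next] - w[s]
        w[s] += step_size * td_error
        s = s_next
    return w


def true_gvf(n_states=5, gamma=0.9):
    """Closed-form discounted sum of the same cumulant on the ring."""
    # Number of steps from each state until the signal is next observed.
    steps = np.array([n_states if s == 0 else n_states - s
                      for s in range(n_states)])
    # The signal recurs every n_states steps, giving a geometric series.
    return gamma ** (steps - 1) / (1.0 - gamma ** n_states)


if __name__ == "__main__":
    print("learned:", np.round(learn_gvf_td0(), 3))
    print("exact:  ", np.round(true_gvf(), 3))
```

In this sketch the question the prediction answers is fixed by hand through the cumulant and the discount. The paper's contribution, as the abstract describes it, is to let the agent choose such predictions itself through meta-gradient descent.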
Taxonomy
Topics: Reinforcement Learning in Robotics · Explainable Artificial Intelligence (XAI) · Neural dynamics and brain function
