General Value Function Networks

Matthew Schlegel, Andrew Jacobsen, Zaheer Abbas, Andrew Patterson, Adam White, and Martha White

arXiv:1807.06763·cs.LG·February 3, 2021

TL;DR

This paper introduces General Value Function Networks (GVFNs), an RNN architecture whose internal state components are predictions about the future. Constraining the state in this way makes training more robust and gives a way to incorporate domain knowledge when learning in partially observable environments.

Contribution

The paper proposes GVFNs, an RNN design in which the internal state consists of predictions about the future, and reports better trainability and robustness than traditional RNNs.

Findings

01. GVFNs are more robust than traditional RNNs to the truncation level used in truncated BPTT.

02. GVFNs often require only one-step gradient updates for training.

03. Constraining the internal state to be predictions about the future improves RNN stability and performance.
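
To make the idea of a one-step gradient update on prediction-valued states concrete, here is a minimal sketch. It is not the authors' algorithm: it assumes a sigmoid state update, a semi-gradient TD(0) target, and a truncation of one step (the previous state is treated as a constant). All names (`gvfn_state`, `one_step_td_update`, `cumulants`, `gammas`) are hypothetical and chosen for illustration only.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def features(prev_state, obs):
    # Input to the state update: previous predictions, observation, bias term.
    return list(prev_state) + list(obs) + [1.0]


def gvfn_state(theta, prev_state, obs):
    # Each state component is a prediction computed from (prev_state, obs).
    # theta has one weight row per state component (assumed layout).
    x = features(prev_state, obs)
    return [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in theta]


def one_step_td_update(theta, prev_state, obs, next_obs, cumulants, gammas, lr):
    """Semi-gradient TD(0) step with truncation 1 (an illustrative assumption).

    prev_state is treated as a constant, so no gradient flows further
    back in time. The bootstrap target is also held fixed.
    """
    x = features(prev_state, obs)
    state = gvfn_state(theta, prev_state, obs)
    next_state = gvfn_state(theta, state, next_obs)
    new_theta = []
    for j, row in enumerate(theta):
        # TD error for prediction j: cumulant plus discounted next prediction.
        delta = cumulants[j] + gammas[j] * next_state[j] - state[j]
        # Derivative of the sigmoid at the current prediction.
        grad_scale = state[j] * (1.0 - state[j])
        new_theta.append(
            [w + lr * delta * grad_scale * xi for w, xi in zip(row, x)]
        )
    return new_theta, state


# Usage: one prediction, one observation feature, zero initial weights.
theta = [[0.0, 0.0, 0.0]]
theta, state = one_step_td_update(
    theta, prev_state=[0.0], obs=[1.0], next_obs=[1.0],
    cumulants=[1.0], gammas=[0.0], lr=1.0,
)
```

With a continuation of zero the target is just the cumulant, so the update moves the prediction for this input toward 1.0. Longer truncation windows would additionally propagate the gradient through `prev_state`.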

Abstract

State construction is important for learning in partially observable environments. A general purpose strategy for state construction is to learn the state update using a Recurrent Neural Network (RNN), which updates the internal state using the current internal state and the most recent observation. This internal state provides a summary of the observed sequence, to facilitate accurate predictions and decision-making. At the same time, specifying and training RNNs is notoriously tricky, particularly as the common strategy to approximate gradients back in time, called truncated Back-prop Through Time (BPTT), can be sensitive to the truncation window. Further, domain expertise, which can usually help constrain the function class and so improve trainability, can be difficult to incorporate into complex recurrent units used within RNNs. In this work, we explore how to use multi-step…
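
The recurrent state update described in the abstract can be sketched as follows. This is a generic tanh RNN cell written in plain Python for illustration, not the architecture from the paper; the function names and the weight layout (`W_s`, `W_o`, `b`) are assumptions.

```python
import math


def rnn_step(state, obs, W_s, W_o, b):
    """One update: new_state = tanh(W_s @ state + W_o @ obs + b)."""
    new_state = []
    for i in range(len(b)):
        pre = b[i]
        pre += sum(W_s[i][j] * state[j] for j in range(len(state)))
        pre += sum(W_o[i][k] * obs[k] for k in range(len(obs)))
        new_state.append(math.tanh(pre))
    return new_state


def run_rnn(observations, W_s, W_o, b):
    # The state starts at zero and is updated once per observation, so the
    # final state summarizes the whole observed sequence.
    state = [0.0] * len(b)
    for obs in observations:
        state = rnn_step(state, obs, W_s, W_o, b)
    return state


# Usage: a one-dimensional state driven by a one-dimensional observation.
final_state = run_rnn([[1.0], [0.5]], W_s=[[0.5]], W_o=[[1.0]], b=[0.0])
```

Truncated BPTT would differentiate through only the last few calls to `rnn_step` in this loop, which is where the sensitivity to the truncation window arises.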
