Generative Temporal Difference Learning for Infinite-Horizon Prediction

Michael Janner; Igor Mordatch; Sergey Levine

arXiv:2010.14496 · cs.LG · November 30, 2021

TL;DR

This paper introduces the $\gamma$-model, a predictive model of environment dynamics with an infinite probabilistic horizon. It generalizes the model rollout and model-based value estimation, and it combines features of model-free and model-based approaches.

Contribution

The paper proposes the $\gamma$-model, a continuous analogue of the successor representation that is trained via a generative reinterpretation of temporal difference learning and instantiated as both a generative adversarial network and a normalizing flow.

Findings

01. The $\gamma$-model predicts long-term environment dynamics over an infinite probabilistic horizon.

02. It is useful for both prediction and control tasks.

03. Its training reflects a tradeoff between training-time and testing-time compounding errors.

Abstract

We introduce the $\gamma$-model, a predictive model of environment dynamics with an infinite probabilistic horizon. Replacing standard single-step models with $\gamma$-models leads to generalizations of the procedures central to model-based control, including the model rollout and model-based value estimation. The $\gamma$-model, trained with a generative reinterpretation of temporal difference learning, is a natural continuous analogue of the successor representation and a hybrid between model-free and model-based mechanisms. Like a value function, it contains information about the long-term future; like a standard predictive model, it is independent of task reward. We instantiate the $\gamma$-model as both a generative adversarial network and normalizing flow, discuss how its training reflects an inescapable tradeoff between training-time and testing-time compounding errors, and…
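The abstract describes the $\gamma$-model as a continuous analogue of the successor representation trained with a bootstrapped, temporal-difference-style target, but gives no equations. The following is a minimal tabular sketch of that idea, not the paper's implementation. It assumes the standard discounted-occupancy definition, $\mu(s_e \mid s) = (1-\gamma)\sum_{k \ge 1} \gamma^{k-1} p(s_{t+k} = s_e \mid s_t = s)$, for a fixed policy. The transition matrix `P` and the function names `discounted_occupancy` and `td_fixed_point` are hypothetical, chosen only for illustration.

```python
import numpy as np

def discounted_occupancy(P, gamma):
    """Closed form of (1 - gamma) * sum_{k>=1} gamma^(k-1) * P^k.

    Assumption: P is a row-stochastic transition matrix under a fixed policy.
    """
    n = P.shape[0]
    return (1.0 - gamma) * P @ np.linalg.inv(np.eye(n) - gamma * P)

def td_fixed_point(P, gamma, iters=500):
    """Iterate the bootstrapped target mu <- (1 - gamma) * P + gamma * P @ mu.

    The first term is the single-step model; the second term bootstraps
    from the current estimate at the next state.
    """
    mu = np.full_like(P, 1.0 / P.shape[0])  # arbitrary initial guess
    for _ in range(iters):
        mu = (1.0 - gamma) * P + gamma * P @ mu
    return mu

# Hypothetical 3-state Markov chain, used only for illustration.
P = np.array([[0.9, 0.1, 0.0],
              [0.0, 0.5, 0.5],
              [0.3, 0.0, 0.7]])
gamma = 0.9

mu_closed = discounted_occupancy(P, gamma)
mu_td = td_fixed_point(P, gamma)
```

Under these assumptions the bootstrapped iteration converges to the closed-form occupancy, each row of the result is a probability distribution over future states, and setting `gamma = 0` recovers the single-step model `P`. The paper replaces this table with a learned generative model (a GAN or a normalizing flow) over continuous states.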

Code & Models

Repositories

JannerM/gamma-models
PyTorch · Official

Taxonomy

Topics: Gaussian Processes and Bayesian Inference · Generative Adversarial Networks and Image Synthesis · Model Reduction and Neural Networks