Generative Temporal Difference Learning for Infinite-Horizon Prediction
Michael Janner, Igor Mordatch, Sergey Levine

TL;DR
This paper introduces the $\gamma$-model, a predictive model of environment dynamics with an infinite probabilistic horizon. It generalizes the procedures central to model-based control, such as model rollouts and model-based value estimation, and combines features of model-free and model-based approaches.
Contribution
The paper proposes the $\gamma$-model, a continuous analogue of the successor representation that is trained via a generative reinterpretation of temporal difference learning and instantiated as both a generative adversarial network and a normalizing flow.
Findings
The $\gamma$-model effectively predicts long-term environment dynamics.
It demonstrates utility in prediction and control tasks.
Its training reflects an inescapable tradeoff between training-time and testing-time compounding errors.
Abstract
We introduce the $\gamma$-model, a predictive model of environment dynamics with an infinite probabilistic horizon. Replacing standard single-step models with $\gamma$-models leads to generalizations of the procedures central to model-based control, including the model rollout and model-based value estimation. The $\gamma$-model, trained with a generative reinterpretation of temporal difference learning, is a natural continuous analogue of the successor representation and a hybrid between model-free and model-based mechanisms. Like a value function, it contains information about the long-term future; like a standard predictive model, it is independent of task reward. We instantiate the $\gamma$-model as both a generative adversarial network and normalizing flow, discuss how its training reflects an inescapable tradeoff between training-time and testing-time compounding errors, and…
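The link to the successor representation can be illustrated in a small tabular setting. The following is a minimal sketch, not the authors' implementation: it assumes a fixed-policy Markov chain with a hypothetical random transition matrix `P`, and it assumes the convention that the discounted occupancy starts counting from the next state, so that $\gamma = 0$ recovers a single-step model. The function names are illustrative. A bootstrapped, temporal-difference-style update is iterated to a fixed point and compared against the closed-form discounted occupancy; the same occupancy is then used to recover a value function for an arbitrary reward vector.

```python
import numpy as np


def discounted_occupancy_td(P, gamma, n_iters=2000):
    """Iterate the bootstrapped target mu <- (1 - gamma) * P + gamma * P @ mu.

    Row s of the result is a distribution over future states reached from s,
    with the horizon geometrically distributed according to gamma.
    """
    n = P.shape[0]
    mu = np.full((n, n), 1.0 / n)  # arbitrary initial guess
    for _ in range(n_iters):
        mu = (1.0 - gamma) * P + gamma * P @ mu
    return mu


def discounted_occupancy_closed_form(P, gamma):
    """Closed form: (1 - gamma) * sum_k gamma^k P^(k+1)."""
    n = P.shape[0]
    return (1.0 - gamma) * P @ np.linalg.inv(np.eye(n) - gamma * P)


# Hypothetical 5-state chain; all values here are illustrative assumptions.
rng = np.random.default_rng(0)
P = rng.random((5, 5))
P /= P.sum(axis=1, keepdims=True)  # make each row a probability distribution
gamma = 0.9

mu_td = discounted_occupancy_td(P, gamma)
mu_exact = discounted_occupancy_closed_form(P, gamma)

# The occupancy is reward-independent; a value function for any reward
# vector r follows from a single expectation, scaled by 1 / (1 - gamma).
r = rng.random(5)
v_from_mu = mu_td @ r / (1.0 - gamma)
# Reference: solve the Bellman equation V = P r + gamma * P V directly.
v_bellman = np.linalg.solve(np.eye(5) - gamma * P, P @ r)

print(np.max(np.abs(mu_td - mu_exact)))
print(np.max(np.abs(v_from_mu - v_bellman)))
```

In this tabular case the occupancy can be stored as a matrix. The continuous setting described in the abstract instead requires a generative model over future states, which is where the GAN and normalizing-flow instantiations come in.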
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Generative Adversarial Networks and Image Synthesis · Model Reduction and Neural Networks
