Consistent Generative Query Networks
Ananya Kumar, S. M. Ali Eslami, Danilo J. Rezende, Marta Garnelo, Fabio Viola, Edward Lockhart, Murray Shanahan

TL;DR
This paper introduces a novel generative model that efficiently produces temporally consistent video frames at arbitrary time points, overcoming the limitations of autoregressive methods in speed and flexibility.
Contribution
The authors propose a model that generates a latent representation from any set of frames, enabling fast, simultaneous sampling of consistent frames at arbitrary times, including jump-ahead capabilities.
Findings
Significant speed improvements over autoregressive models.
Ability to sample frames at arbitrary time points, including future and past.
Effective in both synthetic video prediction and 3D scene reconstruction tasks.
Abstract
Stochastic video prediction models take in a sequence of image frames, and generate a sequence of consecutive future image frames. These models typically generate future frames in an autoregressive fashion, which is slow and requires the input and output frames to be consecutive. We introduce a model that overcomes these drawbacks by generating a latent representation from an arbitrary set of frames that can then be used to simultaneously and efficiently sample temporally consistent frames at arbitrary time-points. For example, our model can "jump" and directly sample frames at the end of the video, without sampling intermediate frames. Synthetic video evaluations confirm substantial gains in speed and functionality without loss in fidelity. We also apply our framework to a 3D scene reconstruction dataset. Here, our model is conditioned on camera location and can sample consistent sets…
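The sampling pattern described in the abstract can be illustrated with a toy sketch. This is not the authors' architecture: the random linear maps, the dimensions, and the function names (`encode_context`, `sample_latent`, `render`) are all hypothetical stand-ins chosen for illustration. The sketch only shows the structure of the idea: an order-independent summary of an arbitrary set of (time, frame) pairs, a single latent sampled from that summary, and a decoder that can be queried at any time-point using the same latent, so distant frames can be produced without generating the ones in between.

```python
import numpy as np

rng = np.random.default_rng(0)
FRAME_DIM, REP_DIM, LATENT_DIM = 16, 8, 4  # toy sizes, assumed for illustration

# Hypothetical random linear maps standing in for learned networks.
W_enc = rng.normal(size=(FRAME_DIM + 1, REP_DIM))
W_mu = rng.normal(size=(REP_DIM, LATENT_DIM))
W_dec = rng.normal(size=(LATENT_DIM + 1, FRAME_DIM))


def encode_context(times, frames):
    """Embed each (time, frame) pair and sum, so context order does not matter."""
    times = np.asarray(times, dtype=float)[:, None]
    inputs = np.concatenate([times, np.asarray(frames, dtype=float)], axis=1)
    return np.tanh(inputs @ W_enc).sum(axis=0)


def sample_latent(representation, rng):
    """Draw one latent for the whole video from a unit-variance Gaussian."""
    mean = representation @ W_mu
    return mean + rng.normal(size=LATENT_DIM)


def render(latent, query_time):
    """Decode a frame at any query time from the shared latent."""
    return np.concatenate([latent, [float(query_time)]]) @ W_dec


# Context frames need not be consecutive.
context_times = np.array([0.0, 3.0, 7.0])
context_frames = rng.normal(size=(3, FRAME_DIM))

r = encode_context(context_times, context_frames)
z = sample_latent(r, rng)

# "Jump" straight to late time-points; every frame shares the same latent z.
sampled = {t: render(z, t) for t in (1, 50, 100)}
```

Because each call to `render` depends only on the shared latent and the query time, the frames can be computed independently (and so in parallel), which is the property an autoregressive sampler lacks.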
Taxonomy
Topics: Graph Theory and Algorithms · Advanced Database Systems and Queries · Data Management and Algorithms
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
