TL;DR
This paper introduces a joint learning framework combining vector autoregressive models and Variational Autoencoders to effectively model visual sequences, capturing both linear and non-linear dynamics.
Contribution
It presents a novel architecture that enables Variational Autoencoders to learn linear Gaussian representations alongside non-linear observations from visual sequences.
Findings
Effective modeling of artificial sequences demonstrated.
Successful application to dynamic textures.
Joint learning improves representation quality.
Abstract
This work studies the problem of modeling visual processes by leveraging deep generative architectures for learning linear, Gaussian representations from observed sequences. We propose a joint learning framework, combining a vector autoregressive model and Variational Autoencoders. This results in an architecture that allows Variational Autoencoders to simultaneously learn a non-linear observation as well as a linear state model from sequences of frames. We validate our approach on artificial sequences and dynamic textures.
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
