Unsupervised Representation Learning from Sparse Transformation Analysis
Yue Song, Thomas Anderson Keller, Yisong Yue, Pietro Perona, Max Welling

TL;DR
This paper introduces an unsupervised method for learning disentangled and approximately equivariant representations from sequence data by factorizing transformations into sparse, independent flow fields, improving data likelihood and transformation understanding.
Contribution
The paper proposes a novel unsupervised model that decomposes latent transformation flows into sparse, independent components, enabling it to learn approximately equivariant representations from sequence data.
Findings
Achieves state-of-the-art data likelihood on sequence datasets.
Demonstrates effective learning of approximately equivariant representations.
Models transformation primitives as sparse, independent flow fields.
Abstract
There is a vast literature on representation learning based on principles such as coding efficiency, statistical independence, causality, controllability, or symmetry. In this paper we propose to learn representations from sequence data by factorizing the transformations of the latent variables into sparse components. Input data are first encoded as distributions of latent activations and subsequently transformed using a probability flow model, before being decoded to predict a future input state. The flow model is decomposed into a number of rotational (divergence-free) vector fields and a number of potential flow (curl-free) fields. Our sparsity prior encourages only a small number of these fields to be active at any instant and infers the speed with which the probability flows along these fields. Training this model is completely unsupervised using a standard variational objective…
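The decomposition described in the abstract can be illustrated with a small two-dimensional sketch. It is not the authors' implementation: in the paper the fields are part of the trained model, whereas here two hand-picked analytic fields stand in for them, and every name (`potential_field`, `rotational_field`, `combined_velocity`, `euler_step`, `FIELDS`) is hypothetical. The sketch shows a curl-free field obtained as the gradient of a potential, a divergence-free field obtained from a stream function, and a sparse vector of speeds that selects which field moves a latent point.

```python
# Toy 2D illustration, assuming hand-picked analytic fields instead of learned ones.

def potential_field(x, y):
    # Curl-free field: gradient of phi(x, y) = 0.5 * (x**2 + y**2).
    return (x, y)

def rotational_field(x, y):
    # Divergence-free field from the stream function psi = 0.5 * (x**2 + y**2):
    # v = (d psi / dy, -d psi / dx) = (y, -x).
    return (y, -x)

FIELDS = [potential_field, rotational_field]

def combined_velocity(x, y, speeds, fields):
    # Velocity is a weighted sum of the fields; a sparse `speeds` list
    # (mostly zeros) means only a few fields are active at this instant.
    vx = sum(s * f(x, y)[0] for s, f in zip(speeds, fields))
    vy = sum(s * f(x, y)[1] for s, f in zip(speeds, fields))
    return (vx, vy)

def divergence(field, x, y, h=1e-5):
    # Central-difference estimate of d vx / dx + d vy / dy.
    dvx_dx = (field(x + h, y)[0] - field(x - h, y)[0]) / (2 * h)
    dvy_dy = (field(x, y + h)[1] - field(x, y - h)[1]) / (2 * h)
    return dvx_dx + dvy_dy

def curl(field, x, y, h=1e-5):
    # Central-difference estimate of the scalar 2D curl d vy / dx - d vx / dy.
    dvy_dx = (field(x + h, y)[1] - field(x - h, y)[1]) / (2 * h)
    dvx_dy = (field(x, y + h)[0] - field(x, y - h)[0]) / (2 * h)
    return dvy_dx - dvx_dy

def euler_step(point, speeds, fields, dt=0.01):
    # Move a latent point one explicit Euler step along the combined field.
    x, y = point
    vx, vy = combined_velocity(x, y, speeds, fields)
    return (x + dt * vx, y + dt * vy)

if __name__ == "__main__":
    sparse_speeds = [0.0, 1.5]  # only the rotational field is active
    z = (1.0, 0.0)
    for _ in range(5):
        z = euler_step(z, sparse_speeds, FIELDS)
    print("latent point after 5 steps:", z)
    print("div of rotational field:", divergence(rotational_field, 0.3, -0.7))
    print("curl of potential field:", curl(potential_field, 0.3, -0.7))
```

The numerical checks confirm that the rotational field has (numerically) zero divergence and the potential field has zero curl, which is the property the two field families are named for. In the paper's setting the flow acts on distributions of latent activations rather than on a single point, so this point-wise Euler update is only a simplified stand-in.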
Taxonomy
Topics: Image Processing and 3D Reconstruction
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
