Unsupervised Feature Learning from Temporal Data
Ross Goroshin, Joan Bruna, Jonathan Tompson, David Eigen, Yann LeCun

TL;DR
This paper proposes an unsupervised method for learning features from video data by leveraging temporal coherence, using a convolutional auto-encoder regularized by slowness and sparsity to improve semantic and temporal consistency.
Contribution
It introduces an unsupervised approach to feature learning from unlabeled video and relates slow feature learning to metric learning, so that the trained encoder defines a more temporally and semantically coherent metric.
Findings
The trained encoder defines a more coherent metric in terms of temporal and semantic similarity.
The method effectively captures semantic information without labeled data.
It demonstrates improved feature coherence in video data.
Abstract
Current state-of-the-art classification and detection algorithms rely on supervised training. In this work we study unsupervised feature learning in the context of temporally coherent video data. We focus on feature learning from unlabeled video data, using the assumption that adjacent video frames contain semantically similar information. This assumption is exploited to train a convolutional pooling auto-encoder regularized by slowness and sparsity. We establish a connection between slow feature learning and metric learning, and show that the trained encoder can be used to define a more temporally and semantically coherent metric.
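To make the abstract's training objective concrete, the following is a minimal NumPy sketch of a loss over one pair of adjacent frames: a reconstruction term, an L1 sparsity term on the code, and an L1 slowness term on the pooled code. It is an illustration under stated assumptions, not the authors' implementation. The dense linear encoder and decoder (standing in for convolutional ones), the ReLU, the L2 pooling over non-overlapping groups, the weights `alpha` and `beta`, and all function names are hypothetical choices made for this example.

```python
import numpy as np

def encode(x, W_enc):
    # Assumed encoder: a linear map followed by ReLU (stand-in for a conv layer).
    return np.maximum(0.0, W_enc @ x)

def l2_pool(h, group_size):
    # Assumed pooling: L2 norm over non-overlapping groups of code units.
    groups = h.reshape(-1, group_size)
    return np.sqrt((groups ** 2).sum(axis=1))

def pair_loss(x_t, x_next, W_enc, W_dec, alpha=0.1, beta=0.1, group_size=2):
    # Loss for two adjacent frames: reconstruction + sparsity + slowness.
    h_t, h_next = encode(x_t, W_enc), encode(x_next, W_enc)
    recon = (np.sum((W_dec @ h_t - x_t) ** 2)
             + np.sum((W_dec @ h_next - x_next) ** 2))
    sparsity = np.abs(h_t).sum() + np.abs(h_next).sum()
    # Slowness: pooled features of adjacent frames should change little.
    slowness = np.abs(l2_pool(h_t, group_size)
                      - l2_pool(h_next, group_size)).sum()
    return recon + alpha * sparsity + beta * slowness

def feature_distance(x_a, x_b, W_enc, group_size=2):
    # Distance between two inputs measured in pooled feature space.
    p_a = l2_pool(encode(x_a, W_enc), group_size)
    p_b = l2_pool(encode(x_b, W_enc), group_size)
    return np.linalg.norm(p_a - p_b)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W_enc = rng.normal(size=(8, 16))   # 16-dim "frame" -> 8-dim code
    W_dec = rng.normal(size=(16, 8))
    x_t = rng.normal(size=16)
    x_next = x_t + 0.01 * rng.normal(size=16)  # a nearby "adjacent frame"
    print(pair_loss(x_t, x_next, W_enc, W_dec))
    print(feature_distance(x_t, x_next, W_enc))
```

In this sketch the slowness term vanishes when the two frames are identical, so only reconstruction and sparsity contribute. `feature_distance` illustrates how an encoder could be reused to define a metric between inputs, in the spirit of the link to metric learning described above.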
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Image and Signal Denoising Methods · Generative Adversarial Networks and Image Synthesis
