On multi-view feature learning
Roland Memisevic (University of Frankfurt)

TL;DR
This paper analyzes multi-view feature learning, showing that hidden variables encode transformations by detecting rotation angles in the eigenspaces shared among multiple image warps, and explaining why transformation-specific and transformation-invariant features emerge when training on multi-observation data.
Contribution
It provides a theoretical analysis of multi-view feature learning, connecting hidden variables to transformations and explaining recent experimental observations.
Findings
Hidden variables encode transformations by detecting rotation angles in the eigenspaces shared among multiple image warps.
Transformation-specific features emerge when complex cell models are trained on videos.
Transformation-invariant features can arise as a by-product of learning transformations.
Abstract
Sparse coding is a common approach to learning local features for object recognition. Recently, there has been an increasing interest in learning features from spatio-temporal, binocular, or other multi-observation data, where the goal is to encode the relationship between images rather than the content of a single image. We provide an analysis of multi-view feature learning, which shows that hidden variables encode transformations by detecting rotation angles in the eigenspaces shared among multiple image warps. Our analysis helps explain recent experimental results showing that transformation-specific features emerge when training complex cell models on videos. Our analysis also shows that transformation-invariant features can emerge as a by-product of learning representations of transformations.
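The idea of "detecting rotation angles in shared eigenspaces" can be illustrated with a small numerical sketch. It is not the model from the paper: it assumes the image warps are 1-D circular shifts, which all commute and are therefore diagonalized by the same Fourier basis. Under that assumption, a shift acts on each Fourier coefficient as a rotation in the complex plane, and the rotation angle identifies the shift. The function names (`shift_matrix`, `dft_matrix`, `rotation_angles`) are hypothetical and chosen only for this illustration.

```python
import numpy as np

def shift_matrix(n, k):
    """Permutation matrix that circularly shifts a length-n signal by k samples."""
    return np.roll(np.eye(n), k, axis=0)

def dft_matrix(n):
    """Unitary DFT matrix; its rows are eigenvectors shared by all circular shifts."""
    return np.fft.fft(np.eye(n), axis=0) / np.sqrt(n)

def rotation_angles(x, y):
    """Angle by which each Fourier coefficient of x is rotated to give that of y."""
    fx, fy = np.fft.fft(x), np.fft.fft(y)
    return np.angle(fy * np.conj(fx))

n, k = 16, 3
rng = np.random.default_rng(0)
x = rng.standard_normal(n)      # first "view"
y = shift_matrix(n, k) @ x      # second view: the warped signal

# Every shift becomes diagonal in the same basis (assumed warp family: shifts).
F = dft_matrix(n)
D = F @ shift_matrix(n, k) @ F.conj().T
off_diagonal = np.abs(D - np.diag(np.diag(D))).max()

# The per-frequency rotation angle depends on the shift, not on the content of x.
angles = rotation_angles(x, y)
expected = -2 * np.pi * np.arange(n) * k / n

# Reading the angle at frequency 1 recovers the shift (modulo n).
k_est = int(round(-angles[1] * n / (2 * np.pi))) % n
```

In this toy setting the angles carry the transformation while the coefficient magnitudes carry the content, which is one way to read the abstract's distinction between encoding the relationship between images and encoding a single image.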
Taxonomy
Topics: Advanced Vision and Imaging · Image Processing Techniques and Applications · Neural Networks and Applications
