A polar prediction model for learning to represent visual transformations
Pierre-Étienne H. Fiquet, Eero P. Simoncelli

TL;DR
This paper introduces a polar prediction model for visual transformations that builds on the Fourier shift theorem and its group-theoretic generalization to improve next-frame prediction in videos, while remaining interpretable and biologically relevant.
Contribution
It proposes a novel polar architecture for self-supervised learning of visual transformations, grounded in the Fourier shift theorem, that improves next-frame prediction and resembles models of primate V1 neurons.
Findings
Outperforms traditional motion compensation in prediction accuracy
Rivals conventional deep networks in prediction performance
Resembles primate V1 neuron models in computational structure
Abstract
All organisms make temporal predictions, and their evolutionary fitness level depends on the accuracy of these predictions. In the context of visual perception, the motions of both the observer and objects in the scene structure the dynamics of sensory signals, allowing for partial prediction of future signals based on past ones. Here, we propose a self-supervised representation-learning framework that extracts and exploits the regularities of natural videos to compute accurate predictions. We motivate the polar architecture by appealing to the Fourier shift theorem and its group-theoretic generalization, and we optimize its parameters on next-frame prediction. Through controlled experiments, we demonstrate that this approach can discover the representation of simple transformation groups acting in data. When trained on natural video datasets, our framework achieves better prediction…
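The Fourier shift theorem that motivates the polar architecture can be illustrated with a small sketch. It assumes a 1-D signal undergoing circular translation at constant speed and uses a fixed Fourier basis, whereas the paper learns its representation from data. The function name `polar_predict` and all parameters are hypothetical and chosen only for illustration. Under these assumptions, translation leaves each coefficient's amplitude unchanged and advances its phase by a fixed amount per frame, so the next frame can be predicted by repeating the last observed phase advance.

```python
import numpy as np

def polar_predict(prev_frame, curr_frame):
    """Predict the next frame by extrapolating Fourier phases (illustrative only)."""
    F_prev = np.fft.fft(prev_frame)
    F_curr = np.fft.fft(curr_frame)
    # Per-frequency phase advance between the two frames, as a unit-modulus complex number.
    cross = F_curr * np.conj(F_prev)
    mag = np.abs(cross)
    # Where a coefficient is (near) zero the phase is undefined, so apply no advance there.
    advance = np.where(mag > 1e-12, cross / np.maximum(mag, 1e-12), 1.0)
    # Keep the current amplitudes and advance each phase by one more step.
    F_next = F_curr * advance
    return np.fft.ifft(F_next).real

# Assumed toy data: a random signal shifted circularly by 3 samples per frame.
rng = np.random.default_rng(0)
x0 = rng.standard_normal(64)
x1 = np.roll(x0, 3)
x2 = np.roll(x0, 6)  # ground-truth next frame

pred = polar_predict(x0, x1)
error = np.max(np.abs(pred - x2))
print(f"max abs prediction error: {error:.2e}")
```

For pure circular translation the prediction matches the true next frame up to floating-point error. Natural videos contain local, non-rigid motion, which is one reason a learned, spatially local representation is needed rather than a global Fourier transform.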
Taxonomy
Topics: Advanced Vision and Imaging · Neural dynamics and brain function · Cell Image Analysis Techniques
