A polar prediction model for learning to represent visual transformations

Pierre-Étienne H. Fiquet; Eero P. Simoncelli

arXiv:2303.03432·stat.ML·November 5, 2024·1 cites

Open Access

TL;DR

This paper introduces a polar prediction model that learns to represent visual transformations by predicting the next frame of a video. The architecture is motivated by the Fourier shift theorem and its group-theoretic generalization, and the authors present it as both interpretable and biologically relevant.

Contribution

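It proposes a polar architecture for self-supervised learning of visual transformations, grounded in the Fourier shift theorem and optimized for next-frame prediction. The paper reports better prediction than traditional motion compensation and a computational structure resembling models of primate V1 neurons.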

Findings

01. Outperforms traditional motion compensation in prediction accuracy

02. Rivals conventional deep networks in prediction performance

03. Resembles primate V1 neuron models in computational structure

Abstract

All organisms make temporal predictions, and their evolutionary fitness level depends on the accuracy of these predictions. In the context of visual perception, the motions of both the observer and objects in the scene structure the dynamics of sensory signals, allowing for partial prediction of future signals based on past ones. Here, we propose a self-supervised representation-learning framework that extracts and exploits the regularities of natural videos to compute accurate predictions. We motivate the polar architecture by appealing to the Fourier shift theorem and its group-theoretic generalization, and we optimize its parameters on next-frame prediction. Through controlled experiments, we demonstrate that this approach can discover the representation of simple transformation groups acting in data. When trained on natural video datasets, our framework achieves better prediction…
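
The Fourier shift theorem that motivates the polar architecture can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a fixed Fourier basis in place of learned filters, a one-dimensional signal, and a pure circular translation at constant speed. The function name `polar_predict` and the parameter `eps` are hypothetical. Each frame is mapped to complex coefficients, the phase advance between the two most recent frames is measured per frequency, and that advance is applied once more while the amplitude is held fixed.

```python
import numpy as np

def polar_predict(prev_frame, curr_frame, eps=1e-12):
    """Extrapolate the next frame by advancing Fourier phases (illustrative only)."""
    z_prev = np.fft.fft(prev_frame)
    z_curr = np.fft.fft(curr_frame)
    # Per-frequency phase advance from prev to curr, normalized to unit modulus.
    ratio = z_curr * np.conj(z_prev)
    advance = ratio / np.maximum(np.abs(ratio), eps)
    # Keep the current amplitude and rotate the phase by the same advance again.
    z_next = z_curr * advance
    return np.real(np.fft.ifft(z_next))

# Synthetic "video": a random signal translating circularly by 3 samples per frame.
rng = np.random.default_rng(0)
signal = rng.standard_normal(64)
frames = [np.roll(signal, 3 * t) for t in range(3)]

prediction = polar_predict(frames[0], frames[1])
error = np.max(np.abs(prediction - frames[2]))            # polar extrapolation
copy_error = np.max(np.abs(frames[1] - frames[2]))        # baseline: repeat last frame
print(f"polar error: {error:.2e}, copy-last-frame error: {copy_error:.2e}")
```

Under these assumptions the extrapolation recovers the third frame up to floating-point error, whereas repeating the last frame does not. Natural videos contain local, non-rigid motion, which is why the paper learns the representation instead of fixing it to a global Fourier basis.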

Taxonomy

Topics: Advanced Vision and Imaging · Neural dynamics and brain function · Cell Image Analysis Techniques