Predictive Coding for Dynamic Vision: Development of Functional Hierarchy in a Multiple Spatio-Temporal Scales RNN Model
Minkyu Choi, Jun Tani

TL;DR
This paper introduces P-MSTRNN, a recurrent neural network that both generates and recognizes dynamic visual patterns within the predictive coding framework, using a spatio-temporal hierarchy that develops through learning from exemplars.
Contribution
The paper develops a new multi-scale RNN model whose neural unit dynamics operate at multiple spatio-temporal scales, and evaluates it on learning whole-body human movement patterns generated from a hierarchically defined movement syntax.
Findings
The model successfully learns hierarchical spatio-temporal representations.
It achieves robust generation and recognition of movement patterns by using the error minimization principle.
Hierarchical neural activity develops through learning from exemplars.
Abstract
The current paper presents a novel recurrent neural network model, the predictive multiple spatio-temporal scales RNN (P-MSTRNN), which can generate as well as recognize dynamic visual patterns in the predictive coding framework. The model is characterized by multiple spatio-temporal scales imposed on neural unit dynamics, through which an adequate spatio-temporal hierarchy develops via learning from exemplars. The model was evaluated in an experiment on learning a set of whole-body human movement patterns, which were generated by following a hierarchically defined movement syntax. The analysis of the trained model clarifies what types of spatio-temporal hierarchy develop in dynamic neural activity, as well as how robust generation and recognition of movement patterns can be achieved by using the error minimization principle.
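The abstract does not give the model's equations, so the following is only a minimal sketch of what "multiple temporal scales imposed on neural unit dynamics" can look like, assuming a generic leaky-integrator update of the kind commonly used in multiple-timescale RNNs. The function names, the time constants (2 and 50), and the constant drive are hypothetical choices for illustration, not the authors' implementation, and the spatial (convolutional) side of the model is left out entirely.

```python
def leaky_step(u_prev, drive, tau):
    """One update of a leaky-integrator unit; tau >= 1 is its time constant.

    A larger tau keeps more of the previous state, so the unit changes slowly.
    (Generic update rule, assumed here for illustration.)
    """
    return (1.0 - 1.0 / tau) * u_prev + (1.0 / tau) * drive


def respond_to_step(tau, n_steps):
    """Internal state of a unit driven by a constant input of 1.0, starting at 0."""
    u = 0.0
    trace = []
    for _ in range(n_steps):
        u = leaky_step(u, 1.0, tau)
        trace.append(u)
    return trace


# Hypothetical time constants: a fast unit and a slow unit.
fast = respond_to_step(tau=2.0, n_steps=10)
slow = respond_to_step(tau=50.0, n_steps=10)

print("fast unit after 10 steps:", round(fast[-1], 3))
print("slow unit after 10 steps:", round(slow[-1], 3))
```

Under these assumptions the fast unit has almost reached the input value after ten steps while the slow unit has barely moved. Units with small time constants can therefore follow rapid changes while units with large ones summarize longer stretches of a sequence.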
