3D View Prediction Models of the Dorsal Visual Stream
Gabriel Sarch, Hsiao-Yu Fish Tung, Aria Wang, Jacob Prince, and Michael Tarr

TL;DR
This study trains a self-supervised geometry-aware recurrent neural network (GRNN) to predict novel camera views of a scene and finds that it accounts for more variance in dorsal visual stream activity than self-supervised baseline models, which in turn account better for ventral regions.
Contribution
Introduces a self-supervised 3D view prediction model whose representations match dorsal stream neural responses better than baseline models previously shown to align well with the ventral stream.
Findings
GRNN better predicts dorsal stream activity
Baseline models align more with ventral regions
Task-relevant models can be used to probe representational differences across visual pathways
Abstract
Deep neural network representations align well with brain activity in the ventral visual stream. However, the primate visual system has a distinct dorsal processing stream with different functional properties. To test if a model trained to perceive 3D scene geometry aligns better with neural responses in dorsal visual areas, we trained a self-supervised geometry-aware recurrent neural network (GRNN) to predict novel camera views using a 3D feature memory. We compared GRNN to self-supervised baseline models that have been shown to align well with ventral regions using the large-scale fMRI Natural Scenes Dataset (NSD). We found that while the baseline models accounted better for ventral brain regions, GRNN accounted for a greater proportion of variance in dorsal brain regions. Our findings demonstrate the potential for using task-relevant models to probe representational differences…
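The comparison described above, in which each model is scored by the proportion of variance it accounts for in a brain region, can be illustrated with a voxelwise encoding analysis. The abstract does not state the regression procedure, so the sketch below is only an assumption: it uses closed-form ridge regression, a single train/test split, and synthetic data in place of NSD responses and model activations. All function and variable names (`fit_ridge`, `encoding_score`, `feats_a`, `feats_b`, `roi`) are hypothetical.

```python
import numpy as np

def fit_ridge(X, Y, alpha):
    """Closed-form ridge weights mapping features X to voxel responses Y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ Y)

def voxelwise_r2(Y_true, Y_pred):
    """Proportion of variance explained, computed separately for each voxel."""
    ss_res = ((Y_true - Y_pred) ** 2).sum(axis=0)
    ss_tot = ((Y_true - Y_true.mean(axis=0)) ** 2).sum(axis=0)
    return 1.0 - ss_res / ss_tot

def encoding_score(features, voxels, n_train, alpha=1.0):
    """Fit on the first n_train images, score R^2 on the held-out rest."""
    X_tr, X_te = features[:n_train], features[n_train:]
    Y_tr, Y_te = voxels[:n_train], voxels[n_train:]
    # Center with training-set statistics only, to avoid test leakage.
    mu_x, mu_y = X_tr.mean(axis=0), Y_tr.mean(axis=0)
    W = fit_ridge(X_tr - mu_x, Y_tr - mu_y, alpha)
    Y_pred = (X_te - mu_x) @ W + mu_y
    return voxelwise_r2(Y_te, Y_pred)

# Synthetic stand-ins (assumption): two models' features for the same images.
rng = np.random.default_rng(0)
n_images, n_feat, n_vox = 400, 20, 50
feats_a = rng.standard_normal((n_images, n_feat))
feats_b = rng.standard_normal((n_images, n_feat))

# A toy "region" whose responses are driven by model A's features plus noise.
roi = feats_a @ rng.standard_normal((n_feat, n_vox))
roi += 0.5 * rng.standard_normal((n_images, n_vox))

r2_a = encoding_score(feats_a, roi, n_train=300)
r2_b = encoding_score(feats_b, roi, n_train=300)
print("mean held-out R^2, model A:", r2_a.mean())
print("mean held-out R^2, model B:", r2_b.mean())
```

In a real analysis the same scoring would be repeated per region of interest, so that one model can come out ahead in dorsal regions while another comes out ahead in ventral ones. The regularization strength `alpha` would normally be chosen by cross-validation rather than fixed.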
Taxonomy
Topics: Visual perception and processing mechanisms · Advanced Vision and Imaging · Visual Attention and Saliency Detection
Methods: ALIGN
