TL;DR
This study identifies two distinct cortical pathways in scene perception, one for scene layout and environmental context and another for animate content, by applying representational similarity analysis to 7T fMRI data and comparing the result with vision and language neural networks.
Contribution
It reveals a two-route cortical organization for scene perception and relates the representational geometry shared across individuals to hierarchical features from vision and language models.
Findings
Two distinct processing routes were identified: a ventromedial pathway for scene layout and environmental context, and a lateral occipitotemporal pathway for animate content.
Vision models align with the representational structure shared across individuals in both routes.
Language models align primarily with the lateral pathway.
Abstract
The brain transforms visual inputs into high-dimensional cortical representations that support diverse cognitive and behavioral goals. Characterizing how this information is organized and routed across the human brain is essential for understanding how we process complex visual scenes. Here, we applied representational similarity analysis to 7T fMRI data collected during natural scene viewing. We quantified representational geometry shared across individuals and compared it to hierarchical features from vision and language neural networks. This analysis revealed two distinct processing routes: a ventromedial pathway specialized for scene layout and environmental context, and a lateral occipitotemporal pathway selective for animate content. Vision models aligned with shared structure in both routes, whereas language models corresponded primarily with the lateral pathway. These findings…
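As a rough illustration of the representational similarity analysis mentioned in the abstract, the sketch below builds a representational dissimilarity matrix (RDM) for two sets of response patterns and compares them by rank correlation. It is not the authors' pipeline: the function names, the correlation-distance metric, the Spearman-style comparison, and the synthetic data are all assumptions chosen for illustration, since this summary does not state which metrics the study used.

```python
import numpy as np


def rdm(patterns):
    """Dissimilarity matrix: 1 - Pearson r between rows (one row per stimulus)."""
    return 1.0 - np.corrcoef(patterns)


def upper_triangle(matrix):
    """Return the off-diagonal upper triangle as a flat vector."""
    rows, cols = np.triu_indices(matrix.shape[0], k=1)
    return matrix[rows, cols]


def ordinal_ranks(values):
    """Ordinal ranks (ties are not averaged; fine for continuous data)."""
    order = np.argsort(values)
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(len(values))
    return ranks


def rsa_score(patterns_a, patterns_b):
    """Rank correlation between the two RDMs' upper triangles."""
    a = ordinal_ranks(upper_triangle(rdm(patterns_a)))
    b = ordinal_ranks(upper_triangle(rdm(patterns_b)))
    return float(np.corrcoef(a, b)[0, 1])


# Synthetic stand-ins (hypothetical): 20 stimuli, 100 voxels, 64 model units.
rng = np.random.default_rng(0)
n_stimuli = 20
latent = rng.normal(size=(n_stimuli, 5))
brain = latent @ rng.normal(size=(5, 100)) + 0.1 * rng.normal(size=(n_stimuli, 100))
model = latent @ rng.normal(size=(5, 64)) + 0.1 * rng.normal(size=(n_stimuli, 64))

brain_rdm = rdm(brain)
score = rsa_score(brain, model)
print(f"brain-model RSA score: {score:.3f}")
```

In a setting like the one described, `brain` would hold voxel responses from one region of interest and `model` the activations of one network layer for the same stimuli, so the score can be compared across regions and layers.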
