TL;DR
This paper introduces an in silico, data-driven method that uses an encoder-decoder transformer to discover and characterize visual category selectivity across the human cortex, moving beyond the simple, typically linear encoding models used by earlier data-driven approaches.
Contribution
It presents an encoder-decoder transformer whose brain-region to image-feature cross-attention enables nonlinear mappings between deep network image features and brain activity, together with a method to synthesize images that activate specific brain regions.
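To make the cross-attention idea concrete, here is a minimal NumPy sketch, not the authors' implementation. It assumes one learned query vector per brain region, patch features from some image backbone, and a per-region linear readout. The function name, shapes, and random weights are all hypothetical and chosen only for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def roi_cross_attention(roi_queries, image_feats, w_k, w_v, w_out):
    """Hypothetical single-head cross-attention from brain regions to image patches.

    roi_queries: (R, d)      one query per brain region (assumed learned)
    image_feats: (P, d_img)  patch features from an image backbone
    w_k, w_v:    (d_img, d)  key and value projections
    w_out:       (R, d)      per-region readout weights
    Returns predicted responses (R,) and attention weights (R, P).
    """
    keys = image_feats @ w_k                      # (P, d)
    values = image_feats @ w_v                    # (P, d)
    scores = roi_queries @ keys.T / np.sqrt(keys.shape[-1])  # (R, P)
    attn = softmax(scores, axis=-1)               # each region attends over patches
    attended = attn @ values                      # (R, d)
    pred = np.sum(attended * w_out, axis=-1)      # (R,) one scalar per region
    return pred, attn

# Toy example with random weights (illustrative only, no real data).
rng = np.random.default_rng(0)
R, P, d_img, d = 4, 16, 32, 8
roi_queries = rng.normal(size=(R, d))
image_feats = rng.normal(size=(P, d_img))
w_k = rng.normal(size=(d_img, d))
w_v = rng.normal(size=(d_img, d))
w_out = rng.normal(size=(R, d))

pred, attn = roi_cross_attention(roi_queries, image_feats, w_k, w_v, w_out)
print(pred.shape, attn.shape)
```

Because the softmax weights depend on the image features, the predicted response is a nonlinear function of those features, unlike a purely linear encoding model. The attention weights also indicate which patches each region draws on for a given image.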
Findings
Revealed complex, compositional selectivity in brain regions.
Validated in silico predictions across multiple subjects.
Provided a framework for hypothesis generation in visual neuroscience.
Abstract
A fine-grained account of functional selectivity in the cortex is essential for understanding how visual information is processed and represented in the brain. Classical studies using designed experiments have identified multiple category-selective regions; however, these approaches rely on preconceived hypotheses about categories. Subsequent data-driven discovery methods have sought to address this limitation but are often limited by simple, typically linear encoding models. We propose an in silico approach for data-driven discovery of novel category-selectivity hypotheses based on an encoder-decoder transformer model. The architecture incorporates a brain-region to image-feature cross-attention mechanism, enabling nonlinear mappings between high-dimensional deep network features and semantic patterns encoded in the brain activity. We further introduce a method to characterize the…