Feature Visualization Recovers Known Cortical Selectivity from TRIBE v2
Stuart Bladon, Brinnae Bent

TL;DR
This paper proposes feature visualization as an interpretability technique for brain encoder models: gradient ascent on the encoder's predicted activation for a target region of interest (ROI) recovers known cortical selectivity and hierarchical organization.
Contribution
It demonstrates that gradient ascent on an encoder's predicted ROI activations can recover cortical hierarchies and functional specializations, providing a qualitative interpretability tool that complements held-out prediction accuracy.
Findings
Revealed a progression of increasing spatial scale and feature complexity from V1 to V4.
Identified distinctive activation patterns for MT, FFA, and PPA regions.
Optimized stimuli for FFA elicit four times more predicted response than natural faces.
Abstract
Brain encoder models predict cortical fMRI responses from the internal activations of pretrained vision and language networks, and are typically evaluated by held-out prediction accuracy. This is a useful signal for training but a poor one for interpretation: it tells us an encoder fits the data without telling us whether it has internalized the functional organization of the brain. We propose feature visualization -- gradient ascent on the encoder's predicted activation for a target region of interest (ROI) -- as a complementary interpretability technique, and apply it to TRIBE v2 composed with V-JEPA 2 (ViT-G, 40 layers), holding both frozen and synthesizing still images for seven regions spanning the ventral and dorsal visual hierarchies. Under identical hyperparameters, the probe recovers a visible progression of increasing spatial scale and feature complexity across V1 to V4,…
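The core idea in the abstract, gradient ascent on a frozen encoder's predicted response for one ROI, can be illustrated with a small self-contained sketch. This is not the authors' pipeline: the "backbone" and "encoder head" below are random toy stand-ins for V-JEPA 2 and TRIBE v2, every function name is hypothetical, and the norm constraint on the stimulus is an assumed regularizer that the listing does not describe. The sketch only shows the optimization pattern: the weights stay fixed and the stimulus is the variable being updated.

```python
import math
import random


def make_frozen_model(n_pixels, n_features, seed=0):
    """Build toy frozen weights (hypothetical stand-in for backbone + encoder)."""
    rng = random.Random(seed)
    # W plays the role of a frozen feature extractor, v a frozen ROI readout.
    W = [[rng.gauss(0, 1 / math.sqrt(n_pixels)) for _ in range(n_pixels)]
         for _ in range(n_features)]
    v = [rng.gauss(0, 1) for _ in range(n_features)]
    return W, v


def predicted_roi_response(x, W, v):
    """Return the scalar predicted ROI response and the intermediate features."""
    feats = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W]
    return sum(vj * fj for vj, fj in zip(v, feats)), feats


def grad_wrt_stimulus(x, W, v):
    """Analytic gradient of the predicted response with respect to the stimulus."""
    _, feats = predicted_roi_response(x, W, v)
    grad = [0.0] * len(x)
    for row, vj, fj in zip(W, v, feats):
        scale = vj * (1.0 - fj * fj)  # derivative of tanh is 1 - tanh^2
        for i, w in enumerate(row):
            grad[i] += scale * w
    return grad


def project_to_norm(x, max_norm):
    """Keep the stimulus inside an L2 ball (assumed regularizer, not from the paper)."""
    norm = math.sqrt(sum(xi * xi for xi in x))
    if norm <= max_norm:
        return x
    return [xi * max_norm / norm for xi in x]


def visualize_roi(W, v, steps=300, lr=0.05, max_norm=3.0, seed=1):
    """Gradient ascent on the stimulus; the model weights are never updated."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 0.01) for _ in range(len(W[0]))]  # start near a blank stimulus
    history = []
    for _ in range(steps):
        response, _ = predicted_roi_response(x, W, v)
        history.append(response)
        g = grad_wrt_stimulus(x, W, v)
        x = project_to_norm([xi + lr * gi for xi, gi in zip(x, g)], max_norm)
    history.append(predicted_roi_response(x, W, v)[0])
    return x, history


if __name__ == "__main__":
    W, v = make_frozen_model(n_pixels=16, n_features=8)
    stimulus, history = visualize_roi(W, v)
    print(f"predicted response: start {history[0]:.3f}, end {history[-1]:.3f}")
```

In a real setting the analytic gradient would be replaced by automatic differentiation through the frozen networks, and the stimulus would be an image tensor rather than a short vector. Running the same loop with identical hyperparameters for each target ROI is what makes the resulting stimuli comparable across regions.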
