Joint rotational invariance and adversarial training of a dual-stream Transformer yields state of the art Brain-Score for Area V4

William Berrios; Arturo Deza

arXiv:2203.06649·q-bio.NC·October 19, 2022·1 cites

TL;DR

A dual-stream Transformer (CrossViT) trained jointly for rotational invariance and adversarial robustness placed 2nd overall in the Brain-Score 2022 competition, held 1st place for explainable variance in area V4 at the time of the competition, and explains more variance in V4, IT, and behavior than a ResNet50 with a V1-like front end.

Contribution

It introduces a joint rotationally-invariant and adversarial optimization procedure for a dual-stream Vision Transformer that improves the model's alignment with visual brain responses, yielding a state-of-the-art Brain-Score for area V4.

Findings

01

Achieved 2nd place in Brain-Score 2022 competition

02

Outperformed a ResNet50 with a V1-like front-end module in explainable variance for V4, IT, and behavior

03

Joint optimization enhances model robustness and interpretability

Abstract

Modern high-scoring models of vision in the Brain-Score competition do not stem from Vision Transformers. However, in this paper, we provide evidence against the unexpected trend of Vision Transformers (ViT) not being perceptually aligned with human visual representations by showing how a dual-stream Transformer, a CrossViT à la Chen et al. (2021), under a joint rotationally-invariant and adversarial optimization procedure yields 2nd place in the aggregate Brain-Score 2022 competition (Schrimpf et al., 2020b) averaged across all visual categories, and at the time of the competition held 1st place for the highest explainable variance of area V4. In addition, our current Transformer-based model also achieves greater explainable variance for areas V4, IT and Behaviour than a biologically-inspired CNN (ResNet50) that integrates a frontal V1-like computation module (Dapello et…
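
The abstract names a joint rotationally-invariant and adversarial optimization procedure but, as excerpted here, does not spell out its steps. The sketch below is one plausible reading, not the authors' implementation: each training step applies a random rotation to the input, then crafts a single-step sign-gradient (FGSM-style) perturbation on the rotated input, and updates the model on the result. Everything in it is an assumption made for illustration: the toy 2-D logistic classifier standing in for CrossViT, the choice of FGSM as the attack, and the hypothetical names `rotate`, `forward`, `fgsm`, and `joint_step`.

```python
import math
import random

def rotate(p, theta):
    """Rotate a 2-D point about the origin by theta radians."""
    x, y = p
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y, s * x + c * y)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(params, p):
    """Toy stand-in model: logistic regression on the features (x^2, y^2)."""
    w0, w1, b = params
    return sigmoid(w0 * p[0] ** 2 + w1 * p[1] ** 2 + b)

def input_grad(params, p, label):
    """Gradient of the binary cross-entropy loss with respect to the input."""
    w0, w1, _ = params
    err = forward(params, p) - label  # dLoss/dlogit for sigmoid + BCE
    return (err * w0 * 2.0 * p[0], err * w1 * 2.0 * p[1])

def sign(v):
    return (v > 0) - (v < 0)

def fgsm(params, p, label, eps):
    """Single-step attack: move each coordinate by eps along the loss-gradient sign."""
    g = input_grad(params, p, label)
    return (p[0] + eps * sign(g[0]), p[1] + eps * sign(g[1]))

def joint_step(params, p, label, eps, lr, rng):
    """One SGD step on a randomly rotated, then adversarially perturbed, input."""
    theta = rng.uniform(0.0, 2.0 * math.pi)   # assumed: uniform random angle
    p_adv = fgsm(params, rotate(p, theta), label, eps)
    w0, w1, b = params
    err = forward(params, p_adv) - label
    # Gradient of the loss with respect to (w0, w1, b) at the adversarial point.
    return (w0 - lr * err * p_adv[0] ** 2,
            w1 - lr * err * p_adv[1] ** 2,
            b - lr * err)
```

Calling `joint_step` repeatedly over labeled points, with a seeded `random.Random` instance as `rng`, gives a complete toy training loop. The ordering used here (rotate first, then attack the rotated input) is a design choice of this sketch; the source excerpt does not say how the two objectives are combined.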

Code & Models

Repositories

williamberrios/BrainScore-Transformers
PyTorch · Official

Taxonomy

Topics: Face Recognition and Perception · EEG and Brain-Computer Interfaces · Adversarial Robustness in Machine Learning

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Concatenated Skip Connection · CrossViT · Dropout · Dense Connections · Residual Connection · Layer Normalization