A recurrent vision transformer shows signatures of primate visual attention
Jonathan Morgan, Badr Albanna, James P. Herman

TL;DR
This paper introduces a Recurrent Vision Transformer that combines self-attention with recurrent memory and, as a computational model, reproduces key signatures of primate visual attention.
Contribution
The paper presents a novel Recurrent Vision Transformer integrating recurrent memory with self-attention, trained on a primate-inspired task, showing primate-like attention behaviors and neural signatures.
Findings
Model exhibits primate-like attention signatures
Attention maps show dynamic spatial prioritization
Perturbations mimic primate neural responses
Abstract
Attention is fundamental to both biological and artificial intelligence, yet research on animal attention and AI self-attention remains largely disconnected. We propose a Recurrent Vision Transformer (Recurrent ViT) that integrates self-attention with recurrent memory, allowing both current inputs and stored information to guide attention allocation. Trained solely via sparse reward feedback on a spatially cued orientation change detection task, a paradigm used in primate studies, our model exhibits primate-like signatures of attention, including improved accuracy and faster responses for cued stimuli that scale with cue validity. Analysis of self-attention maps reveals dynamic spatial prioritization with reactivation prior to expected changes, and targeted perturbations produce performance shifts similar to those observed in primate frontal eye fields and superior colliculus. These…
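The core idea in the abstract, self-attention that is guided by both the current input and stored information, can be illustrated with a minimal sketch. This is not the authors' architecture: the single attention head, the zero-initialized memory slots, the rule that memory slots simply carry their attended output forward, and the names `recurrent_attention_step` and `run_sequence` are all assumptions made here for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def recurrent_attention_step(patches, memory, Wq, Wk, Wv):
    """One time step of single-head self-attention over [memory; patches].

    patches: (P, d) embeddings of the current frame's image patches
    memory:  (M, d) memory slots carried over from the previous step
    Returns the updated memory (M, d) and the attention map (M+P, M+P).
    """
    tokens = np.concatenate([memory, patches], axis=0)
    q, k, v = tokens @ Wq, tokens @ Wk, tokens @ Wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    attn = softmax(scores, axis=-1)  # each row sums to 1
    out = attn @ v
    # Hypothetical update rule: memory slots keep their attended output.
    new_memory = out[: memory.shape[0]]
    return new_memory, attn


def run_sequence(frames, n_mem=2, seed=0):
    """Run the step over a sequence of frames with shape (T, P, d)."""
    d = frames.shape[-1]
    rng = np.random.default_rng(seed)
    # Random projection weights stand in for trained parameters.
    Wq, Wk, Wv = (rng.normal(scale=d ** -0.5, size=(d, d)) for _ in range(3))
    memory = np.zeros((n_mem, d))
    attention_maps = []
    for patches in frames:
        memory, attn = recurrent_attention_step(patches, memory, Wq, Wk, Wv)
        attention_maps.append(attn)
    return memory, attention_maps


if __name__ == "__main__":
    frames = np.random.default_rng(1).normal(size=(3, 4, 8))  # 3 frames, 4 patches, d=8
    memory, maps = run_sequence(frames)
    print(memory.shape, maps[-1].shape)
```

Because the memory slots are part of the token set at every step, what the model attended to on earlier frames can influence how attention is distributed on later ones, which is the property the abstract relies on when it describes cue information guiding later attention allocation.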
Taxonomy
Topics: Neural dynamics and brain function · Memory and Neural Mechanisms · Visual perception and processing mechanisms
Methods: Attention Is All You Need · Byte Pair Encoding · Layer Normalization · Residual Connection · Linear Layer · Dense Connections · Multi-Head Attention · Position-Wise Feed-Forward Layer · Adam · Softmax
