A recurrent vision transformer shows signatures of primate visual attention

Jonathan Morgan; Badr Albanna; James P. Herman

arXiv:2502.10955·cs.CV·February 18, 2025

Open Access

TL;DR

This paper introduces a Recurrent Vision Transformer that combines self-attention with recurrent memory and, when trained on a task used in primate studies, reproduces key behavioral signatures of primate visual attention.

Contribution

The paper presents a novel Recurrent Vision Transformer integrating recurrent memory with self-attention, trained on a primate-inspired task, showing primate-like attention behaviors and neural signatures.
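
One way to picture "integrating recurrent memory with self-attention" is to let the current frame's patch tokens attend over both themselves and a memory state carried from earlier time steps. The sketch below is only an illustration under that assumption, not the authors' architecture: the function name `recurrent_attention_step`, the single memory slot, the mean-pooled memory update, and all dimensions are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def recurrent_attention_step(patches, memory, Wq, Wk, Wv):
    """One time step: patch tokens attend over [patches; memory].

    patches: (P, d) tokens for the current input frame
    memory:  (M, d) state carried over from earlier frames
    """
    context = np.concatenate([patches, memory], axis=0)  # (P + M, d)
    q = patches @ Wq
    k = context @ Wk
    v = context @ Wv
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))       # (P, P + M)
    out = attn @ v                                       # (P, d)
    # Hypothetical memory update: blend old memory with the pooled output.
    new_memory = 0.5 * memory + 0.5 * out.mean(axis=0, keepdims=True)
    return out, new_memory, attn

rng = np.random.default_rng(0)
d, P = 8, 4                        # assumed token width and patch count
Wq, Wk, Wv = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))
memory = np.zeros((1, d))          # a single memory slot, for simplicity
for t in range(3):                 # three frames of random "input"
    patches = rng.normal(size=(P, d))
    out, memory, attn = recurrent_attention_step(patches, memory, Wq, Wk, Wv)
```

Each row of `attn` is a distribution over the current patches plus the memory slot, so both current inputs and stored information can shape where attention goes on a given step.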

Findings

01 Model exhibits primate-like attention signatures

02 Attention maps show dynamic spatial prioritization

03 Perturbations mimic primate neural responses

Abstract

Attention is fundamental to both biological and artificial intelligence, yet research on animal attention and AI self-attention remains largely disconnected. We propose a Recurrent Vision Transformer (Recurrent ViT) that integrates self-attention with recurrent memory, allowing both current inputs and stored information to guide attention allocation. Trained solely via sparse reward feedback on a spatially cued orientation change detection task, a paradigm used in primate studies, our model exhibits primate-like signatures of attention, including improved accuracy and faster responses for cued stimuli that scale with cue validity. Analysis of self-attention maps reveals dynamic spatial prioritization with reactivation prior to expected changes, and targeted perturbations produce performance shifts similar to those observed in primate frontal eye fields and superior colliculus. These…
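
The trial structure of a spatially cued change detection task, and what "cue validity" means in it, can be sketched in a few lines. This is a minimal illustration under assumptions, not the paper's task code: the names `sample_trial` and `sparse_reward`, the number of locations, the change probability, and the reward values are all hypothetical.

```python
import random

def sample_trial(n_locations=4, cue_validity=0.8, change_prob=0.5, rng=random):
    """Sample one trial of a cued change detection task (hypothetical layout).

    cue_validity is the probability that a change, if one occurs,
    happens at the cued location rather than at an uncued one.
    """
    cued = rng.randrange(n_locations)
    change_location = None                      # None means a no-change trial
    if rng.random() < change_prob:
        if rng.random() < cue_validity:
            change_location = cued              # valid cue
        else:
            others = [i for i in range(n_locations) if i != cued]
            change_location = rng.choice(others)  # invalid cue
    return {"cued": cued, "change_location": change_location}

def sparse_reward(trial, responded_change):
    """Single end-of-trial reward: 1.0 for a correct report, else 0.0."""
    changed = trial["change_location"] is not None
    return 1.0 if responded_change == changed else 0.0

rng = random.Random(0)
trials = [sample_trial(rng=rng) for _ in range(10_000)]
change_trials = [t for t in trials if t["change_location"] is not None]
valid_fraction = sum(
    t["change_location"] == t["cued"] for t in change_trials
) / len(change_trials)
```

With these assumed settings, `valid_fraction` should land near the chosen `cue_validity`. Varying that parameter across training runs is how one would probe whether accuracy and response-time benefits scale with cue validity, as the abstract reports.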

Taxonomy

Topics: Neural dynamics and brain function · Memory and Neural Mechanisms · Visual perception and processing mechanisms

Methods: Attention Is All You Need · Byte Pair Encoding · Layer Normalization · Residual Connection · Linear Layer · Dense Connections · Multi-Head Attention · Position-Wise Feed-Forward Layer · Adam · Softmax