Where to Bind Matters: Hebbian Fast Weights in Vision Transformers for Few-Shot Character Recognition

Gavin Money; Sindhuja Penchala; Jiacheng Li; Noorbakhsh Amiri Golilarz

arXiv:2605.02920 · cs.NE · May 6, 2026

TL;DR

This paper explores integrating Hebbian fast weights into vision transformers to enable rapid adaptation for few-shot character recognition, achieving state-of-the-art accuracy on Omniglot.

Contribution

The paper introduces a single-module placement strategy for Hebbian fast weights in Swin-Tiny: one module is applied to the final-stage feature map, which improves few-shot accuracy and training stability.

Findings

01 Hebbian modules placed at the final stage yield the highest accuracy.

02 Swin-Hebbian achieves 96.2% accuracy at 5-way 1-shot and 99.2% at 5-way 5-shot.

03 Per-block placement causes training instability in ViT and DeiT.

Abstract

Standard transformer architectures learn fixed slow-weight representations during training and lack mechanisms for rapid adaptation within an episode. In contrast, biological neural systems achieve such adaptation through fast synaptic updates that form transient associative memories during inference, a property known as Hebbian plasticity. In this paper, we conduct an empirical study of Hebbian Fast-Weight (HFW) modules integrated into multiple transformer backbones, including ViT-Small, DeiT-Small, and Swin-Tiny. We evaluate six model variants (ViT, DeiT, Swin, ViT-Hebbian, DeiT-Hebbian, and Swin-Hebbian) on 5-way 1-shot and 5-way 5-shot classification tasks using the Omniglot benchmark under a Prototypical Network meta-learning framework. We propose a single-module placement strategy for Swin-Tiny in which one HFW module is applied to the final-stage feature map after all hierarchical stages…
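
To make the two ingredients named in the abstract concrete, here is a minimal pure-Python sketch of a generic Hebbian fast-weight memory (an outer-product write with decay, followed by an associative read) and of nearest-prototype classification as used in Prototypical Networks. The excerpt does not give the paper's actual HFW formulation, so the update rule, the `decay` and `lr` parameters, and all function names are illustrative assumptions, not the authors' implementation.

```python
def hebbian_write(A, key, value, decay=0.9, lr=0.5):
    """Assumed generic rule: A <- decay * A + lr * outer(value, key)."""
    return [[decay * A[i][j] + lr * value[i] * key[j]
             for j in range(len(key))]
            for i in range(len(value))]


def hebbian_read(A, query):
    """Associative read: matrix-vector product A @ query."""
    return [sum(a * q for a, q in zip(row, query)) for row in A]


def prototypes(support):
    """Mean embedding per class; `support` maps label -> list of embeddings."""
    return {label: [sum(col) / len(embs) for col in zip(*embs)]
            for label, embs in support.items()}


def classify(query, protos):
    """Return the label of the nearest prototype (squared Euclidean distance)."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(protos, key=lambda label: sqdist(query, protos[label]))


# Toy fast-weight memory: bind key [1, 0] to value [0, 1], then query it.
A = [[0.0, 0.0], [0.0, 0.0]]
A = hebbian_write(A, key=[1.0, 0.0], value=[0.0, 1.0])
print(hebbian_read(A, [1.0, 0.0]))  # [0.0, 0.5]

# Toy 2-way 2-shot episode with hypothetical 2-D embeddings.
support = {"a": [[1.0, 0.0], [0.8, 0.2]], "b": [[0.0, 1.0], [0.2, 0.8]]}
print(classify([0.7, 0.3], prototypes(support)))  # a
```

In a real model the embeddings would come from the transformer backbone, and the fast-weight matrix would be reset at the start of each episode so that bindings stay transient. How the paper combines the memory read-out with the backbone features is not specified in this excerpt.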
