The Anxiety of Influence: Bloom Filters in Transformer Attention Heads

Peter Balogh

arXiv:2602.17526 · cs.LG · February 20, 2026

TL;DR

This paper identifies attention heads in transformer models that act as Bloom filter-like membership testers, and analyzes their properties, capacities, and roles in token processing.

Contribution

The paper identifies and characterizes genuine membership-testing heads in transformers, and shows that they are multi-resolution, that they generalize, and that they coexist with other computational functions.

Findings

01. Three genuine membership-testing heads exhibit Bloom filter-like behavior.

02. These heads are concentrated in early layers and respond to repeated tokens broadly.

03. Membership heads contribute to processing both repeated and novel tokens.

Abstract

Some transformer attention heads appear to function as membership testers, dedicating themselves to answering the question "has this token appeared before in the context?" We identify these heads across four language models (GPT-2 small, medium, and large; Pythia-160M) and show that they form a spectrum of membership-testing strategies. Two heads (L0H1 and L0H5 in GPT-2 small) function as high-precision membership filters with false positive rates of 0-4\% even at 180 unique context tokens -- well above the $d_{head} = 64$ bit capacity of a classical Bloom filter. A third head (L1H11) shows the classic Bloom filter capacity curve: its false positive rate follows the theoretical formula $p \approx (1 - e^{- k n / m})^{k}$ with $R^{2} = 1.0$ and fitted capacity $m \approx 5$ bits, saturating by $n \approx 20$ unique tokens. A fourth head initially identified as a Bloom filter (L3H0) was…
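
To make the capacity curve quoted in the abstract concrete, here is a minimal Python sketch of the classical Bloom filter false positive formula $p \approx (1 - e^{-kn/m})^{k}$, together with the ground-truth membership question ("has this token appeared before in the context?"). The function names `bloom_fpr` and `seen_before` are hypothetical, and the choice $k = 1$ is an illustrative assumption: the abstract reports a fitted capacity $m \approx 5$ but does not state the number of hash functions $k$ in the portion shown here. This is not the paper's code.

```python
import math


def bloom_fpr(n, m, k=1):
    """Theoretical Bloom filter false positive rate, p ~ (1 - exp(-k*n/m))**k.

    n: number of distinct items inserted (unique context tokens)
    m: capacity in bits
    k: number of hash functions (k=1 is an assumption made for illustration)
    """
    if m <= 0 or k <= 0:
        raise ValueError("m and k must be positive")
    return (1.0 - math.exp(-k * n / m)) ** k


def seen_before(tokens):
    """Ground-truth membership labels: True where a token occurred earlier."""
    seen = set()
    labels = []
    for tok in tokens:
        labels.append(tok in seen)  # membership test against the prior context
        seen.add(tok)
    return labels


if __name__ == "__main__":
    # Capacity curve for m = 5 bits (the fitted value quoted in the abstract),
    # with k = 1 assumed purely for illustration.
    for n in (1, 5, 10, 20, 40):
        print(f"n={n:3d}  p={bloom_fpr(n, m=5, k=1):.3f}")

    # Which positions hold a token that already appeared in the context?
    print(seen_before(["the", "cat", "saw", "the", "cat"]))
```

Under these assumed parameters the predicted false positive rate rises monotonically with the number of unique tokens and approaches 1 once $n$ is several times $m$, which matches the qualitative saturation the abstract describes for L1H11.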

Taxonomy

Topics: Ferroelectric and Negative Capacitance Devices · Neurobiology of Language and Bilingualism · Face Recognition and Perception