Dissecting Linear Recurrent Models: How Different Gating Strategies Drive Selectivity and Generalization

Younes Bouhadjar; Maxime Fabre; Felix Schmidt; Emre Neftci

arXiv:2601.12598 · cs.LG · January 21, 2026

Open Access

TL;DR

This paper introduces SelectivBench, a lightweight synthetic benchmark for evaluating linear recurrent models' ability to focus on relevant inputs, revealing how gating and forgetting mechanisms influence selectivity and generalization.

Contribution

It proposes a refined taxonomy of linear recurrent models and provides a new benchmark, SelectivBench, for systematic evaluation of their selectivity and generalization capabilities.

Findings

01. Gating and rapid forgetting mechanisms facilitate recall.

02. In-state channel mixing is unnecessary for selectivity but critical for generalization.

03. Softmax attention remains dominant because its memory capacity scales with sequence length.

Abstract

Linear recurrent neural networks have emerged as efficient alternatives to the original Transformer's softmax attention mechanism, thanks to their highly parallelizable training and constant memory and computation requirements at inference. Iterative refinements of these models have introduced an increasing number of architectural mechanisms, leading to increased complexity and computational costs. Nevertheless, systematic direct comparisons among these models remain limited. Existing benchmark tasks are either too simplistic to reveal substantial differences or excessively resource-intensive for experimentation. In this work, we propose a refined taxonomy of linear recurrent models and introduce SelectivBench, a set of lightweight and customizable synthetic benchmark tasks for systematically evaluating sequence models. SelectivBench specifically evaluates selectivity in sequence models…
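
As a rough illustration of the "constant memory" and "forgetting" ideas mentioned in the abstract, the following is a minimal sketch of a generic gated linear recurrence. It is not taken from the paper and does not correspond to any specific model in its taxonomy. The function name `gated_linear_recurrence`, the scalar per-step forget gate, and the outer-product state update are all assumptions chosen for simplicity.

```python
def gated_linear_recurrence(keys, values, queries, forget_gates):
    """Illustrative gated linear recurrence (hypothetical, not the paper's model).

    Update:  S_t = a_t * S_{t-1} + k_t v_t^T
    Readout: o_t = S_t^T q_t

    The state S has a fixed shape (d_k x d_v), so inference memory does not
    grow with sequence length. A forget gate a_t near 0 erases the old state;
    a gate near 1 retains it.
    """
    d_k, d_v = len(keys[0]), len(values[0])
    state = [[0.0] * d_v for _ in range(d_k)]
    outputs = []
    for k, v, q, a in zip(keys, values, queries, forget_gates):
        # Decay the previous state, then write the new key-value association.
        for i in range(d_k):
            for j in range(d_v):
                state[i][j] = a * state[i][j] + k[i] * v[j]
        # Read from the state with the query.
        outputs.append(
            [sum(q[i] * state[i][j] for i in range(d_k)) for j in range(d_v)]
        )
    return outputs, state
```

With orthogonal keys and all gates set to 1.0, a value written at the first step can still be read back at a later step. Setting a later gate to 0.0 erases it, which is the kind of forgetting behavior the benchmark is described as probing.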

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications