RAM-Net: Expressive Linear Attention with Selectively Addressable Memory

Kaicheng Xiao; Haotian Li; Liran Dong; Guoliang Xing

arXiv:2602.11958 · cs.LG · February 13, 2026

Open Access

TL;DR

RAM-Net introduces a high-dimensional sparse memory architecture with selective addressing, enabling exponential scaling of state size and improved expressivity in linear attention models, while maintaining computational efficiency.

Contribution

The paper presents RAM-Net, a novel linear attention architecture with explicit sparse memory addressing, bridging the gap between expressivity and efficiency in sequence modeling.

Findings

01. Outperforms state-of-the-art baselines on long-range retrieval tasks

02. Achieves competitive results on language modeling benchmarks

03. Captures dependencies more effectively while requiring less computation

Abstract

While linear attention architectures offer efficient inference, compressing unbounded history into a fixed-size memory inherently limits expressivity and causes information loss. To address this limitation, we introduce Random Access Memory Network (RAM-Net), a novel architecture designed to bridge the gap between the representational capacity of full attention and the memory efficiency of linear models. The core of RAM-Net maps inputs to high-dimensional sparse vectors serving as explicit addresses, allowing the model to selectively access a massive memory state. This design enables exponential state size scaling without additional parameters, which significantly mitigates signal interference and enhances retrieval fidelity. Moreover, the inherent sparsity ensures exceptional computational efficiency, as state updates are confined to minimal entries. Extensive experiments demonstrate…
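
The abstract does not spell out how addresses are formed, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a hypothetical addressing scheme in which a dense key is split into chunks, each chunk picks its largest entry, and the winning positions are read as digits of one address. Under that assumption the address space grows as chunk_size ** num_chunks with no extra parameters, and each write touches a single slot. The names `sparse_address` and `SparseMemory` are illustrative and do not come from the paper.

```python
def sparse_address(key, chunk_size):
    """Map a dense key to one slot index (hypothetical scheme, not from the paper).

    The key is split into chunks of `chunk_size`; the argmax position of each
    chunk is treated as one digit of a base-`chunk_size` number.
    """
    if len(key) % chunk_size != 0:
        raise ValueError("key length must be a multiple of chunk_size")
    address = 0
    for start in range(0, len(key), chunk_size):
        chunk = key[start:start + chunk_size]
        winner = max(range(chunk_size), key=lambda i: chunk[i])
        address = address * chunk_size + winner
    return address


class SparseMemory:
    """Addressable memory that stores only the slots that have been written."""

    def __init__(self, value_dim, chunk_size):
        self.value_dim = value_dim
        self.chunk_size = chunk_size
        self.slots = {}  # address -> accumulated value vector

    def write(self, key, value):
        # Additive update, as in linear attention, but confined to one slot.
        address = sparse_address(key, self.chunk_size)
        slot = self.slots.setdefault(address, [0.0] * self.value_dim)
        for i, v in enumerate(value):
            slot[i] += v
        return address

    def read(self, query):
        # Unwritten addresses read back as zeros.
        address = sparse_address(query, self.chunk_size)
        return list(self.slots.get(address, [0.0] * self.value_dim))


# Two keys of dimension 8 with chunk_size 4 give 4 ** 2 = 16 possible addresses.
memory = SparseMemory(value_dim=2, chunk_size=4)
key_a = [0.9, 0.1, 0.0, 0.0, 0.0, 0.0, 0.8, 0.2]
key_b = [0.0, 0.7, 0.0, 0.1, 0.5, 0.0, 0.0, 0.0]
addr_a = memory.write(key_a, [1.0, 2.0])
addr_b = memory.write(key_b, [3.0, 4.0])
read_a = memory.read(key_a)
```

Because the two keys land in different slots, reading with `key_a` returns its value without contribution from `key_b`. A dense fixed-size state would instead sum both writes into the same matrix, which is the interference the abstract refers to.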

Taxonomy

Topics: Advanced Neural Network Applications · Topic Modeling · Domain Adaptation and Few-Shot Learning