State space models can express n-gram languages

Vinoth Nandakumar; Qiang Qu; Peng Mi; Tongliang Liu

arXiv:2306.17184 · cs.CL · March 11, 2025

Open Access

TL;DR

This paper demonstrates that state space models (SSMs) can theoretically and practically encode n-gram language rules, showing their expressiveness and potential advantages over traditional n-gram models in next-word prediction tasks.

Contribution

The paper provides a theoretical framework proving SSMs can encode n-gram rules and shows how their context window can be controlled, bridging the gap between rule-based and neural models.

Findings

1. SSMs can encode n-gram rules using new theoretical results.
2. The spectrum of the state transition matrix controls the context window.
3. Experiments show SSMs can be applied to n-gram generated data.
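
The link between the transition matrix's spectrum and the context window can be illustrated with a toy example. The sketch below is not the paper's construction; it assumes a linear recurrence h_t = A h_{t-1} + B x_t with one-hot token inputs, and all names (`shift_ssm`, `run`, `decode`, `rules`) are hypothetical. Here A is a block-shift matrix, so it is nilpotent (all eigenvalues are zero) and the state retains exactly the last `W` tokens, which is the context an n-gram rule with n = W + 1 needs.

```python
# Toy linear SSM: h_t = A h_{t-1} + B x_t, pure Python, no dependencies.
# Assumption: one-hot inputs and a block-shift A (illustrative only).

def shift_ssm(vocab_size, window):
    """Build (A, B) so the state stores the last `window` one-hot tokens."""
    d = vocab_size * window
    A = [[0.0] * d for _ in range(d)]
    # Block j of the new state copies block j-1 of the old state.
    for i in range(vocab_size, d):
        A[i][i - vocab_size] = 1.0
    # B writes the current one-hot input into block 0.
    B = [[1.0 if i == j else 0.0 for j in range(vocab_size)] for i in range(d)]
    return A, B

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def matmul(X, Y):
    cols = list(zip(*Y))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in X]

def one_hot(token, vocab_size):
    return [1.0 if i == token else 0.0 for i in range(vocab_size)]

def run(A, B, tokens, vocab_size):
    """Apply the recurrence from a zero initial state; return the final state."""
    h = [0.0] * len(A)
    for t in tokens:
        Ah = matvec(A, h)
        Bx = matvec(B, one_hot(t, vocab_size))
        h = [a + b for a, b in zip(Ah, Bx)]
    return h

def decode(h, vocab_size):
    """Read token ids back from the state, most recent first (None if empty)."""
    blocks = [h[i:i + vocab_size] for i in range(0, len(h), vocab_size)]
    return [blk.index(1.0) if 1.0 in blk else None for blk in blocks]

V, W = 3, 2                      # vocabulary size, context window (trigram rules)
A, B = shift_ssm(V, W)

# Two histories that differ only outside the window give the same state.
h1 = run(A, B, [0, 1, 2, 1], V)
h2 = run(A, B, [2, 2, 2, 1], V)
context = decode(h1, V)          # most recent first: [1, 2]

# Hypothetical trigram rule table standing in for a learned readout:
# (second-to-last token, last token) -> next token.
rules = {(2, 1): 0}
next_token = rules[tuple(reversed(context))]

# A is nilpotent: A^W is the zero matrix, so inputs older than W steps vanish.
A_pow_W = matmul(A, A)
```

With a nilpotent A the window is a hard cutoff. If A instead had nonzero eigenvalues of modulus below one, older inputs would be attenuated geometrically rather than erased, giving a soft window whose effective length grows as the spectral radius approaches one.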

Abstract

Recent advancements in recurrent neural networks (RNNs) have reinvigorated interest in their application to natural language processing tasks, particularly with the development of more efficient and parallelizable variants known as state space models (SSMs), which have shown competitive performance against transformer models while maintaining a lower memory footprint. While RNNs and SSMs (e.g., Mamba) have been empirically more successful than rule-based systems based on n-gram models, a rigorous theoretical explanation for this success has not yet been developed, as it is unclear how these models encode the combinatorial rules that govern the next-word prediction task. In this paper, we construct state space language models that can solve the next-word prediction task for languages generated from n-gram rules, thereby showing that the former are more expressive. Our proof shows how…

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification