Lower Bounds on the Expressivity of Recurrent Neural Language Models

Anej Svete; Franz Nowak; Anisha Mohamed Sahabdeen; Ryan Cotterell

arXiv:2405.19222·cs.CL·June 19, 2024

TL;DR

This paper studies the expressive power of recurrent neural network language models (RNN LMs). It shows that, with linearly bounded precision, they can represent arbitrary regular probabilistic languages, i.e., the distributions over strings defined by probabilistic finite-state automata.

Contribution

The paper connects RNN language models to probabilistic finite-state automata and demonstrates that they can express arbitrary regular probabilistic languages.
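To make the notion of a regular probabilistic language concrete, here is a minimal sketch of a probabilistic finite-state automaton and the forward computation of a string's probability. The two-state automaton, its weights, and the names `INITIAL`, `FINAL`, `TRANSITIONS`, `string_probability`, and `total_mass` are all hypothetical and chosen only for illustration; none of them come from the paper or its repository.

```python
from itertools import product

# Hypothetical 2-state probabilistic FSA over the alphabet {"a", "b"}.
# INITIAL[i]           : probability of starting in state i
# TRANSITIONS[s][i][j] : probability of moving from state i to state j while emitting s
# FINAL[i]             : probability of stopping in state i
# Locally normalized: for each state, outgoing transitions plus FINAL sum to 1.
INITIAL = [1.0, 0.0]
FINAL = [0.2, 0.5]
TRANSITIONS = {
    "a": [[0.5, 0.3], [0.0, 0.25]],   # state 0 on "a" may stay or move: nondeterministic
    "b": [[0.0, 0.0], [0.25, 0.0]],
}


def string_probability(string):
    """Sum the weights of all accepting paths for `string` (forward algorithm)."""
    n = len(INITIAL)
    forward = list(INITIAL)
    for symbol in string:
        matrix = TRANSITIONS[symbol]
        forward = [
            sum(forward[i] * matrix[i][j] for i in range(n))
            for j in range(n)
        ]
    return sum(f * r for f, r in zip(forward, FINAL))


def total_mass(max_length):
    """Total probability assigned to all strings of length <= max_length."""
    return sum(
        string_probability("".join(chars))
        for length in range(max_length + 1)
        for chars in product(TRANSITIONS, repeat=length)
    )
```

Because state 0 can reach two different states on the symbol "a", a single string can have several accepting paths, and its probability is the sum over all of them. Calling `total_mass` with increasing lengths gives values that approach 1 from below, which is what one expects of a distribution over strings.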

Findings

01 RNN LMs with linearly bounded precision can express all regular probabilistic languages.

02 The study bridges the gap between results on recognizing formal languages and results on modeling distributions over strings with RNNs.

03 The paper provides theoretical lower bounds on the expressivity of RNN-based language models.

Abstract

The recent successes and spread of large neural language models (LMs) call for a thorough understanding of their computational ability. Describing their computational abilities through LMs' representational capacity is a lively area of research. However, investigation into the representational capacity of neural LMs has predominantly focused on their ability to recognize formal languages. For example, recurrent neural networks (RNNs) with Heaviside activations are tightly linked to regular languages, i.e., languages defined by finite-state automata (FSAs). Such results, however, fall short of describing the capabilities of RNN language models (LMs), which are definitionally distributions over strings. We take a fresh look at the representational capacity of RNN LMs by connecting them to probabilistic FSAs and demonstrate that RNN LMs with linearly…
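The link between Heaviside-activated RNNs and regular languages mentioned in the abstract can be illustrated with a small sketch. It uses a standard one-hot construction in which each hidden unit stands for a (state, symbol) pair; this is an assumption made for illustration and not necessarily the construction used in the paper. The parity automaton and all names (`DELTA`, `UNITS`, `run_rnn`, `decode_state`, `run_dfa`) are hypothetical.

```python
# Hypothetical deterministic FSA tracking the parity of the number of "a" symbols.
STATES = [0, 1]
ALPHABET = ["a", "b"]
DELTA = {(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1}

# One hidden unit per (state, symbol) pair; exactly one unit is active at a time.
UNITS = [(q, s) for q in STATES for s in ALPHABET]


def heaviside(x):
    """Heaviside step with H(0) = 0."""
    return 1 if x > 0 else 0


# Recurrence matrix: unit (q_next, sym) listens to every unit whose state
# reaches q_next when reading sym.
U = [[1 if DELTA[(q_prev, sym)] == q_next else 0 for (q_prev, _) in UNITS]
     for (q_next, sym) in UNITS]
# Input matrix: unit (q_next, sym) listens to the one-hot input for sym.
V = [[1 if sym == a else 0 for a in ALPHABET] for (_, sym) in UNITS]
# Bias of -1 makes each unit fire only when both conditions hold (logical AND).
BIAS = -1


def run_rnn(string, start_state=0):
    """Elman-style update h_t = H(U h_{t-1} + V x_t + b); returns the final hidden state."""
    h = [1 if unit == (start_state, ALPHABET[0]) else 0 for unit in UNITS]
    for symbol in string:
        x = [1 if a == symbol else 0 for a in ALPHABET]
        h = [
            heaviside(
                sum(U[i][j] * h[j] for j in range(len(UNITS)))
                + sum(V[i][k] * x[k] for k in range(len(ALPHABET)))
                + BIAS
            )
            for i in range(len(UNITS))
        ]
    return h


def decode_state(h):
    """Read the automaton state off the single active hidden unit."""
    return UNITS[h.index(1)][0]


def run_dfa(string, start_state=0):
    """Reference run of the automaton, for comparison with the RNN."""
    q = start_state
    for symbol in string:
        q = DELTA[(q, symbol)]
    return q
```

For any string over the alphabet, `decode_state(run_rnn(s))` should agree with `run_dfa(s)`. This sketch only covers recognition by a deterministic automaton; the probabilistic, possibly nondeterministic setting that the paper addresses requires more than a single active unit and is not captured here.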

Code & Models

Repositories

rycolab/nondeterministic-rnns
Official