pLSTM: parallelizable Linear Source Transition Mark networks

Korbinian Pöppel; Richard Freinschlag; Thomas Schmied; Wei Lin; Sepp Hochreiter

arXiv:2506.11997 · cs.LG · June 16, 2025

Open Access · 1 Dataset · 1 Video

TL;DR

The paper introduces pLSTM, a parallelizable linear RNN architecture designed for processing complex data structures like DAGs and grids, enabling efficient long-range information propagation and outperforming Transformers on certain tasks.

Contribution

The paper extends linear RNNs to multi-dimensional and graph-structured data while keeping them parallelizable, so that long-distance dependencies can be propagated effectively.

Findings

01. pLSTM generalizes well to larger images, outperforming Transformers.
02. pLSTM effectively handles long-range dependencies in DAGs.
03. pLSTM performs strongly on molecular graph and computer vision benchmarks.

Abstract

Modern recurrent architectures, such as xLSTM and Mamba, have recently challenged the Transformer in language modeling. However, their structure constrains their applicability to sequences only or requires processing multi-dimensional data structures, such as images or molecular graphs, in a pre-defined sequential order. In contrast, Multi-Dimensional RNNs (MDRNNs) are well suited for data with a higher level structure, like 2D grids, trees, and directed acyclic graphs (DAGs). In this work, we extend the notion of multi-dimensionality to linear RNNs. We introduce parallelizable Linear Source Transition Mark networks (pLSTMs) using Source, Transition, and Mark gates that act on the line graph of a general DAG. This enables parallelization in analogy to parallel associative scans and the chunkwise-recurrent form of sequential linear RNNs, but for DAGs. For regular grids (1D and 2D), like…
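The parallel associative scan that the abstract uses as its analogy can be illustrated for the ordinary sequential case. The following is a minimal sketch in Python with NumPy, assuming a scalar linear recurrence h_t = a_t * h_{t-1} + b_t with h_{-1} = 0. It is not the pLSTM algorithm and does not involve Source, Transition, or Mark gates or DAGs; the function names `sequential_scan` and `parallel_scan` are hypothetical and chosen only for this illustration.

```python
import numpy as np


def sequential_scan(a, b):
    """Reference: evaluate h_t = a_t * h_{t-1} + b_t one step at a time."""
    h = np.zeros(len(b))
    prev = 0.0
    for t in range(len(a)):
        prev = a[t] * prev + b[t]
        h[t] = prev
    return h


def parallel_scan(a, b):
    """Same recurrence via a Hillis-Steele style associative scan.

    Each pair (a, b) represents the affine map h -> a * h + b.
    Composing an earlier map (a1, b1) with a later map (a2, b2) gives
    (a1 * a2, a2 * b1 + b2), which is associative, so all prefixes can
    be combined in about log2(n) vectorized rounds.
    """
    a = np.asarray(a, dtype=float).copy()
    b = np.asarray(b, dtype=float).copy()
    n = len(a)
    shift = 1
    while shift < n:
        # Partner of position t is position t - shift; positions without
        # a partner are padded with the identity map (a=1, b=0).
        a_prev = np.concatenate([np.ones(shift), a[:-shift]])
        b_prev = np.concatenate([np.zeros(shift), b[:-shift]])
        b = a * b_prev + b   # uses the not-yet-updated a
        a = a * a_prev
        shift *= 2
    return b


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.uniform(0.1, 0.9, size=13)
    b = rng.normal(size=13)
    print(np.allclose(sequential_scan(a, b), parallel_scan(a, b)))
```

The sketch trades extra arithmetic for a logarithmic number of dependent steps, which is the property the abstract refers to when it describes parallelization for linear RNNs. How that idea carries over to DAGs is described in the paper itself.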

Code & Models

Datasets

ml-jku/arrow_pointing_extrapolation
dataset · 15 dl

Videos

pLSTM: parallelizable Linear Source Transition Mark networks · SlidesLive

Taxonomy

Topics: Power Systems and Technologies · Algorithms and Data Compression · Parallel Computing and Optimization Techniques