SequenceLayers: Sequence Processing and Streaming Neural Networks Made Easy
RJ Skerry-Ryan, Julian Salazar, Soroosh Mariooryad, David Kao, Daisy Stanton, Eric Battenberg, Matt Shannon, Ron J. Weiss, Robin Scheibler, Jonas Rothfuss, Tom Bagby

TL;DR
SequenceLayers introduces an API and library for sequence modeling that simplifies building streamable, correct, and flexible neural network models capable of both layer-wise and step-by-step execution.
Contribution
It provides a novel layer API with explicit state representation and step methods, enabling easy creation of streamable and correct sequence models across deep learning frameworks.
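To make the idea of "explicit state plus a step method" concrete, here is a minimal pure-Python sketch. It is not the library's code: the class `EmaLayer`, the method names `get_initial_state`, `step`, and `layer`, and the helper `run_stepwise` are assumptions chosen for illustration. A toy exponential moving average stands in for a recurrent layer.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class EmaLayer:
    """Toy recurrent layer (hypothetical): y[t] = decay * y[t-1] + (1 - decay) * x[t]."""

    decay: float = 0.5

    def get_initial_state(self) -> float:
        # The explicit state is the previous output, starting at zero.
        return 0.0

    def step(self, x_t: float, state: float) -> Tuple[float, float]:
        # Consume one timestep and return (output, new_state).
        y_t = self.decay * state + (1.0 - self.decay) * x_t
        return y_t, y_t

    def layer(self, xs: List[float]) -> List[float]:
        # Stateless call over the whole sequence at once.
        ys, prev = [], 0.0
        for x_t in xs:
            prev = self.decay * prev + (1.0 - self.decay) * x_t
            ys.append(prev)
        return ys


def run_stepwise(seq_layer: EmaLayer, xs: List[float]) -> List[float]:
    # Drive the layer one timestep at a time, threading the state through.
    state = seq_layer.get_initial_state()
    ys = []
    for x_t in xs:
        y_t, state = seq_layer.step(x_t, state)
        ys.append(y_t)
    return ys
```

In this sketch the caller owns the state and passes it back in on every call, so the same layer object can serve many independent streams.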
Findings
Enables models to be executed both layer-by-layer and step-by-step.
Ensures identical results between different execution modes.
Implementations are available in JAX and TensorFlow 2.
Abstract
We introduce a neural network layer API and library for sequence modeling, designed for easy creation of sequence models that can be executed both layer-by-layer (e.g., teacher-forced training) and step-by-step (e.g., autoregressive sampling). To achieve this, layers define an explicit representation of their state over time (e.g., a Transformer KV cache, a convolution buffer, an RNN hidden state), and a step method that evolves that state, tested to give identical results to a stateless layer-wise invocation. This and other aspects of the SequenceLayers contract enable complex models to be immediately streamable, mitigate a wide range of common bugs arising in both streaming and parallel sequence processing, and can be implemented in any deep learning library. A composable and declarative API, along with a comprehensive suite of layers and combinators, streamlines the construction of…
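The "convolution buffer" mentioned in the abstract can be illustrated with a small pure-Python sketch of a causal 1D convolution. Everything here is an assumption made for illustration (the class `CausalConv1D`, its method names, and the block-wise `step` signature); it is not taken from the library. The state is the last `kernel_size - 1` inputs, and the final loop checks that feeding the sequence in blocks reproduces the layer-wise result.

```python
from typing import List, Tuple


class CausalConv1D:
    """Toy causal convolution (hypothetical); kernel[-1] weights the current sample."""

    def __init__(self, kernel: List[float]):
        self.kernel = list(kernel)

    def get_initial_state(self) -> List[float]:
        # Buffer of the last (kernel_size - 1) inputs, zero-filled so that
        # it matches the zero left-padding used by the layer-wise call.
        return [0.0] * (len(self.kernel) - 1)

    def _convolve(self, padded: List[float], num_outputs: int) -> List[float]:
        k = len(self.kernel)
        return [
            sum(w * v for w, v in zip(self.kernel, padded[t:t + k]))
            for t in range(num_outputs)
        ]

    def layer(self, xs: List[float]) -> List[float]:
        # Stateless call: left-pad the whole sequence, then convolve.
        padded = self.get_initial_state() + list(xs)
        return self._convolve(padded, len(xs))

    def step(self, block: List[float], state: List[float]) -> Tuple[List[float], List[float]]:
        # Streaming call: prepend the buffered inputs, convolve the block,
        # and keep the trailing (kernel_size - 1) inputs as the new state.
        padded = list(state) + list(block)
        ys = self._convolve(padded, len(block))
        keep = len(self.kernel) - 1
        return ys, padded[len(padded) - keep:]


conv = CausalConv1D([1.0, 2.0, 3.0])
xs = [1.0, 0.0, -1.0, 2.0, 4.0]

# Stream the same input in uneven blocks and compare with the layer-wise call.
state, streamed = conv.get_initial_state(), []
for block in ([1.0], [0.0, -1.0], [2.0, 4.0]):
    ys, state = conv.step(block, state)
    streamed.extend(ys)

assert streamed == conv.layer(xs)
```

An equivalence check of this kind is the sort of test the abstract describes: the step path and the layer-wise path should agree regardless of how the input is chunked.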
