From Static Inference to Dynamic Interaction: A Survey of Streaming Large Language Models

Junlong Tong; Zilong Wang; YuJie Ren; Peiran Yin; Hao Wu; Wei Zhang; Xiaoyu Shen

arXiv:2603.04592 · cs.CL · April 21, 2026

TL;DR

This survey reviews the emerging field of streaming large language models: it clarifies their definition, proposes a taxonomy, and discusses methodologies, applications, and future research directions.

Contribution

The survey provides a unified definition, a systematic taxonomy, and a comprehensive analysis of streaming LLMs, addressing the fragmentation of existing definitions in the field.

Findings

01 Established a unified definition based on data flow and interaction.

02 Proposed a systematic taxonomy of streaming LLMs.

03 Discussed applications and future research directions.

Abstract

Standard Large Language Models (LLMs) are predominantly designed for static inference with pre-defined inputs, which limits their applicability in dynamic, real-time scenarios. To address this gap, the streaming LLM paradigm has emerged. However, existing definitions of streaming LLMs remain fragmented, conflating streaming generation, streaming inputs, and interactive streaming architectures, while a systematic taxonomy is still lacking. This paper provides a comprehensive overview and analysis of streaming LLMs. First, we establish a unified definition of streaming LLMs based on data flow and dynamic interaction to clarify existing ambiguities. Building on this definition, we propose a systematic taxonomy of current streaming LLMs and conduct an in-depth discussion on their underlying methodologies. Furthermore, we explore the applications of streaming LLMs in real-world scenarios and…

Code & Models

Repositories

EIT-NLP/Awesome-Streaming-LLMs (GitHub)
