Multi-Stream LLMs: Unblocking Language Models with Parallel Streams of Thoughts, Inputs and Outputs

Guinan Su; Yanwu Yang; Xueyan Li; Jonas Geiping

arXiv:2605.12460·cs.LG·May 13, 2026

TL;DR

This paper proposes multi-stream language models that process parallel streams of thoughts, inputs, and outputs to overcome the limitations of sequential message exchanges in AI agents, enhancing efficiency and usability.

Contribution

It introduces a novel instruction-tuning approach for parallel streams in language models, enabling simultaneous reading, thinking, and acting within a single model pass.
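As a rough illustration of what "simultaneous reading, thinking, and acting" means in scheduling terms, the toy sketch below compares a single-stream timeline with a multi-stream one. It is not the paper's method or code: the function names and the dictionary layout are hypothetical, and the model deliberately ignores data dependencies (for example, that an output may need to wait for a particular input). It only shows how steps are laid out when each stream gets its own slot per step.

```python
from itertools import zip_longest


def sequential_schedule(read, think, write):
    """Single stream: one token per step, phases run strictly one after another."""
    steps = []
    for tok in read:
        steps.append({"read": tok, "think": None, "write": None})
    for tok in think:
        steps.append({"read": None, "think": tok, "write": None})
    for tok in write:
        steps.append({"read": None, "think": None, "write": tok})
    return steps


def parallel_schedule(read, think, write):
    """Multi-stream (idealized): every step has one slot per stream.

    Assumption: streams advance independently, so shorter streams simply
    idle (None) once they run out of tokens.
    """
    return [
        {"read": r, "think": t, "write": w}
        for r, t, w in zip_longest(read, think, write)
    ]


if __name__ == "__main__":
    read, think, write = ["a", "b", "c"], ["x", "y"], ["p"]
    print(len(sequential_schedule(read, think, write)))  # 6 steps
    print(len(parallel_schedule(read, think, write)))    # 3 steps
```

In the sequential timeline the step count is the sum of the stream lengths, while in the idealized parallel timeline it is the length of the longest stream. A real model would sit between these extremes, because later tokens in one stream usually depend on earlier tokens in another.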

Findings

01. Parallel streams improve model efficiency and responsiveness.

02. Multi-stream models address usability limitations of sequential message formats.

03. Enhanced separation of concerns increases security and monitorability.

Abstract

Continued improvements in language model capability have unlocked their widespread use as drivers of autonomous agents, for example in coding or computer-use applications. However, the core of these systems has not changed much since early instruction-tuned models like ChatGPT. Even advanced AI agents operate on message-exchange formats, successively exchanging messages with users, systems, themselves (i.e., chain-of-thought), and tools in a single stream of computation. This restriction to a single stream in chat models leads to a number of limitations: the agent cannot act (generate output) while reading and, conversely, cannot react to new information while writing. Similarly, the agent cannot act while thinking and cannot think while reading or acting on information. In this work, we show that models can be unblocked by switching from instruction-tuning for sequential message…

Code & Models

Repositories

seal-rg/streaming (GitHub)
