Long Context Scaling: Divide and Conquer via Multi-Agent Question-driven Collaboration

Sibo Xiao; Zixin Lin; Wenyang Gao; Hui Chen; Yue Zhang

arXiv:2505.20625 · cs.CL · September 30, 2025

Open Access

TL;DR

This paper introduces XpandA, a multi-agent framework that combines a question-driven workflow with dynamic partitioning to improve long-context processing in large language models while reducing accumulated latency and information loss.

Contribution

The paper presents a multi-agent approach that uses dynamic partitioning and question-guided knowledge sharing to improve long-context processing in LLMs. It targets the accumulated latency, information loss, and broken textual dependencies of earlier agent-based divide-and-conquer methods.

Findings

1. Achieves a 20% performance improvement on long-context benchmarks.

2. Provides a 1.5x inference speedup over baseline methods.

3. Effectively processes sequences of up to 1 million tokens.

Abstract

Processing long contexts has become a critical capability for modern large language models (LLMs). Existing work relies on agent-based divide-and-conquer methods to process long contexts, but these methods face crucial limitations: prohibitive accumulated latency and amplified information loss from excessive agent invocations, and disruption of inherent textual dependencies by overly aggressive partitioning. In this paper, we propose XpandA (Expand-Agent), a novel multi-agent framework coupled with a question-driven workflow and dynamic partitioning for robust long-context processing. XpandA overcomes these limitations through: 1) dynamic partitioning of long texts, which adaptively modulates the filling rate of context windows for input sequences of vastly varying lengths; 2) a question-guided protocol to update flat information ensembles within centralized shared memory,…
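The abstract does not say how the filling rate is chosen, so the following is only a rough illustration of what "adaptively modulating the filling rate" could look like, not the authors' algorithm. It assumes a fixed context window and a hypothetical cap `max_fill` on how full each window may be; the names `plan_partitions` and `fill_rate` are invented for this sketch. Short inputs stay in one lightly filled window, while long inputs are spread over as few near-equal chunks as the cap allows.

```python
import math


def plan_partitions(n_tokens: int, window: int, max_fill: float = 0.8) -> list[tuple[int, int]]:
    """Split n_tokens into contiguous, near-equal (start, end) spans.

    Hypothetical policy (an assumption, not taken from the paper): use the
    fewest chunks such that no chunk exceeds max_fill * window tokens, then
    balance chunk sizes so every agent sees a similar amount of text.
    """
    if n_tokens <= 0:
        return []
    budget = max(1, int(window * max_fill))   # tokens allowed per chunk
    n_chunks = math.ceil(n_tokens / budget)   # fewest chunks that fit the budget
    base, extra = divmod(n_tokens, n_chunks)  # spread the remainder evenly
    spans = []
    start = 0
    for i in range(n_chunks):
        size = base + (1 if i < extra else 0)
        spans.append((start, start + size))
        start += size
    return spans


def fill_rate(spans: list[tuple[int, int]], window: int) -> float:
    """Fraction of the context window used by the largest chunk."""
    return max(end - start for start, end in spans) / window


if __name__ == "__main__":
    for n in (1_000, 100_000, 1_000_000):
        spans = plan_partitions(n, window=128_000)
        print(n, len(spans), round(fill_rate(spans, 128_000), 3))
```

Balancing chunk sizes rather than filling each window to the cap keeps the number of agent invocations low without leaving one tiny trailing chunk, which is one plausible way to trade off the latency and dependency concerns the abstract raises.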

Taxonomy

Topics: Speech and dialogue systems · Multi-Agent Systems and Negotiation · Geographic Information Systems Studies

Methods: Byte Pair Encoding · Linear Layer · Attention Is All You Need · WordPiece · Multi-Head Attention · BART · Softmax · Layer Normalization · Adam · Linear Warmup With Linear Decay