Long Context Scaling: Divide and Conquer via Multi-Agent Question-driven Collaboration
Sibo Xiao, Zixin Lin, Wenyang Gao, Hui Chen, Yue Zhang

TL;DR
This paper introduces XpandA, a multi-agent framework with a question-driven workflow and dynamic partitioning that significantly improves long-context processing in large language models by reducing latency and information loss.
Contribution
The paper presents a novel multi-agent approach with dynamic partitioning and question-guided knowledge sharing to enhance long-context processing in LLMs, addressing limitations of existing methods.
Findings
Achieves a 20% performance improvement on long-context benchmarks.
Provides 1.5x inference speedup over baseline methods.
Effectively processes sequences up to 1 million tokens.
Abstract
Processing long contexts has become a critical capability for modern large language models (LLMs). Existing work leverages agent-based divide-and-conquer methods to process long contexts, but these methods face crucial limitations: prohibitive accumulated latency and amplified information loss from excessive agent invocations, and the disruption of inherent textual dependencies by excessive partitioning. In this paper, we propose XpandA (Expand-Agent), a novel multi-agent framework coupled with a question-driven workflow and dynamic partitioning for robust long-context processing. XpandA overcomes these limitations through: 1) dynamic partitioning of long texts, which adaptively modulates the filling rate of context windows for input sequences of vastly varying lengths; 2) a question-guided protocol to update flat information ensembles within centralized shared memory,…
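To make the idea of dynamic partitioning concrete, here is a minimal sketch of one way a chunker could adapt the context-window fill rate to the input length. It is not the XpandA algorithm, which this summary does not specify: the function name `dynamic_partition`, the `max_fill` cap, and the even-split rule are all assumptions made for illustration.

```python
import math
from typing import List


def dynamic_partition(tokens: list, window: int, max_fill: float = 0.8) -> List[list]:
    """Split `tokens` into near-equal chunks, each at most max_fill * window long.

    Hypothetical illustration only: the fill rate (chunk length / window)
    is derived from the input length instead of being fixed, so short inputs
    stay in one lightly filled window and long inputs approach `max_fill`.
    """
    if window <= 0 or not 0 < max_fill <= 1:
        raise ValueError("window must be positive and max_fill in (0, 1]")
    n = len(tokens)
    if n == 0:
        return []
    # Largest chunk length allowed under the fill-rate cap (assumed rule).
    budget = max(1, int(window * max_fill))
    # Fewest chunks that respect the budget, then spread tokens evenly.
    num_chunks = math.ceil(n / budget)
    size = math.ceil(n / num_chunks)
    return [tokens[i:i + size] for i in range(0, n, size)]


if __name__ == "__main__":
    chunks = dynamic_partition(list(range(12)), window=10, max_fill=0.5)
    for chunk in chunks:
        # Report each chunk's length and its resulting fill rate.
        print(len(chunk), len(chunk) / 10)
```

In this sketch the even split keeps the number of chunks (and thus agent invocations) at the minimum the budget allows, which is the trade-off the abstract describes between latency and partition size.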
Taxonomy
Topics: Speech and dialogue systems · Multi-Agent Systems and Negotiation · Geographic Information Systems Studies
Methods: Byte Pair Encoding · Linear Layer · Attention Is All You Need · WordPiece · Multi-Head Attention · BART · Softmax · Layer Normalization · Adam · Linear Warmup With Linear Decay
