LongAgent: Scaling Language Models to 128k Context through Multi-Agent Collaboration
Jun Zhao, Can Zu, Hao Xu, Yi Lu, Wei He, Yiwen Ding, Tao Gui, Qi Zhang, Xuanjing Huang

TL;DR
LongAgent is a multi-agent collaboration framework that scales language models to 128k-token contexts, improving performance on long-text tasks such as retrieval and multi-hop QA relative to existing models such as GPT-4.
Contribution
The paper presents LongAgent, a novel multi-agent approach that enables existing LLMs to process much longer contexts effectively, addressing hallucination issues through inter-agent communication.
Findings
LongAgent with LLaMA-7B outperforms GPT-4 in long-text retrieval.
The approach effectively reduces hallucinations via inter-member communication.
Significant improvements in multi-hop question answering tasks.
Abstract
Large language models (LLMs) have demonstrated impressive performance in understanding language and executing complex reasoning tasks. However, LLMs with long context windows have been notorious for their expensive training costs and high inference latency. Even the most advanced models such as GPT-4 and Claude2 often make mistakes when processing inputs of over 100k tokens, a phenomenon also known as "lost in the middle". In this paper, we propose LongAgent, a method based on multi-agent collaboration, which scales LLMs (e.g., LLaMA) to a context of 128K and demonstrates potential superiority in long-text processing compared to GPT-4. In LongAgent, a leader is responsible for understanding user intent and directing team members to acquire information from documents. Due to members' hallucinations, it is non-trivial for a leader to obtain accurate information…
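The leader/member division described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the members here are toy string-matching functions rather than LLMs, and every name (`split_into_chunks`, `keyword_member`, `collect_answers`, `has_conflict`) is hypothetical. The sketch only shows how a long input might be split into chunks, how each member might answer from its own chunk, and how a leader could detect that members disagree, which is the point at which inter-member communication would be needed.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical member interface: (chunk_text, question) -> answer, or None if
# the chunk holds nothing relevant. In LongAgent a member would be an LLM.
Member = Callable[[str, str], Optional[str]]


def split_into_chunks(tokens: List[str], chunk_size: int) -> List[List[str]]:
    """Split a token list into consecutive chunks of at most chunk_size tokens."""
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]


def keyword_member(chunk: str, question: str) -> Optional[str]:
    """Toy stand-in for a member: looks for '<question> is <value>' in its chunk."""
    marker = f"{question} is "
    idx = chunk.find(marker)
    if idx == -1:
        return None
    rest = chunk[idx + len(marker):].split()
    return rest[0] if rest else None


def collect_answers(document: str, question: str, chunk_size: int,
                    member: Member = keyword_member) -> Dict[int, str]:
    """Leader step (assumed): give each member one chunk and gather non-empty answers."""
    tokens = document.split()  # whitespace tokens, a simplification of real tokenization
    chunks = [" ".join(c) for c in split_into_chunks(tokens, chunk_size)]
    answers: Dict[int, str] = {}
    for i, chunk in enumerate(chunks):
        answer = member(chunk, question)
        if answer is not None:
            answers[i] = answer
    return answers


def has_conflict(answers: Dict[int, str]) -> bool:
    """True when members report different answers, i.e. some member may be hallucinating."""
    return len(set(answers.values())) > 1
```

As a usage example under the same assumptions, `collect_answers("passkey is 1 a a a passkey is 2 a a a", "passkey", 6)` yields one answer per chunk, and `has_conflict` flags the disagreement. How the conflict is then resolved is deliberately left out, since the truncated abstract does not specify it.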
Taxonomy
Topics: Natural Language Processing Techniques · Semantic Web and Ontologies · Topic Modeling
Methods: Attention Is All You Need · Linear Layer · Dense Connections · Label Smoothing · Adam · Softmax · Multi-Head Attention · Layer Normalization · Dropout · Residual Connection
