LongAgent: Scaling Language Models to 128k Context through Multi-Agent Collaboration

Jun Zhao; Can Zu; Hao Xu; Yi Lu; Wei He; Yiwen Ding; Tao Gui; Qi Zhang; Xuanjing Huang

arXiv:2402.11550·cs.CL·March 14, 2024·1 cites

Open Access · 1 Repo

TL;DR

LongAgent is a multi-agent collaboration framework that scales language models to 128k-token contexts and reports better results than GPT-4 on long-text tasks such as retrieval and multi-hop QA.

Contribution

The paper presents LongAgent, a multi-agent approach that lets existing LLMs process contexts of up to 128k tokens and uses inter-member communication to counter hallucinations by individual members.

Findings

01 LongAgent with LLaMA-7B outperforms GPT-4 in long-text retrieval.

02 Inter-member communication reduces the effect of members' hallucinations.

03 LongAgent yields significant improvements on multi-hop question answering tasks.

Abstract

Large language models (LLMs) have demonstrated impressive performance in understanding language and executing complex reasoning tasks. However, LLMs with long context windows have been notorious for their expensive training costs and high inference latency. Even the most advanced models such as GPT-4 and Claude2 often make mistakes when processing inputs of over 100k tokens, a phenomenon also known as "lost in the middle". In this paper, we propose LongAgent, a method based on multi-agent collaboration, which scales LLMs (e.g., LLaMA) to a context of 128K and demonstrates potential superiority in long-text processing compared to GPT-4. In LongAgent, a leader is responsible for understanding user intent and directing team members to acquire information from documents. Due to members' hallucinations, it is non-trivial for a leader to obtain accurate information…
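
The leader/member division described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the members here are plain keyword matchers rather than LLMs, and the names `split_into_chunks`, `grounded_member`, `hallucinating_member`, and `leader` are hypothetical. The conflict-resolution step (two disagreeing members pool their chunks and answer again) is only an assumption about what inter-member communication might look like, since the abstract as quoted here does not spell it out.

```python
from collections import Counter
from typing import Callable, Dict, List, Optional

# Hypothetical interface: a member reads one chunk and answers a key lookup.
Member = Callable[[str, str], Optional[str]]


def split_into_chunks(text: str, chunk_size: int) -> List[str]:
    """Split text into consecutive chunks of at most chunk_size words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, len(words), chunk_size)]


def grounded_member(chunk: str, key: str) -> Optional[str]:
    """Toy member: returns the value in '<key> is <value>' if its chunk has it."""
    words = chunk.split()
    for i in range(len(words) - 2):
        if words[i] == key and words[i + 1] == "is":
            return words[i + 2]
    return None


def hallucinating_member(chunk: str, key: str) -> Optional[str]:
    """Toy member that fabricates '0000' whenever its chunk lacks evidence."""
    found = grounded_member(chunk, key)
    return found if found is not None else "0000"


def leader(document: str, key: str, chunk_size: int, member: Member) -> Optional[str]:
    """Dispatch one member per chunk, then reconcile conflicting answers."""
    chunks = split_into_chunks(document, chunk_size)
    answers = [member(chunk, key) for chunk in chunks]

    # Assumed resolution: while members disagree, one representative from each
    # of two disagreeing groups pools its chunk with the other and re-answers.
    for _ in range(len(chunks)):  # bounded number of rounds
        groups: Dict[str, List[int]] = {}
        for idx, ans in enumerate(answers):
            if ans is not None:
                groups.setdefault(ans, []).append(idx)
        if len(groups) <= 1:
            break
        (_, first), (_, second) = list(groups.items())[:2]
        merged = chunks[first[0]] + " " + chunks[second[0]]
        agreed = member(merged, key)
        for idx in first + second:
            answers[idx] = agreed

    votes = Counter(a for a in answers if a is not None)
    return votes.most_common(1)[0][0] if votes else None


if __name__ == "__main__":
    doc = " ".join(["sky"] * 10 + ["passkey", "is", "7421"] + ["sky"] * 10)
    print(leader(doc, "passkey", 5, hallucinating_member))  # prints 7421
```

In this toy setting, four of the five members fabricate "0000" because their chunks hold no evidence. Once a fabricating member sees the chunk that contains the passkey, it answers from the evidence, and the leader ends up with a single consistent answer.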

Code & Models

Repositories

zuucan/needleinahaystack-plus
none · Official

Taxonomy

Topics: Natural Language Processing Techniques · Semantic Web and Ontologies · Topic Modeling

Methods: Attention Is All You Need · Linear Layer · Dense Connections · Label Smoothing · Adam · Softmax · Multi-Head Attention · Layer Normalization · Dropout · Residual Connection