MemLong: Memory-Augmented Retrieval for Long Text Modeling

Weijie Liu; Zecheng Tang; Juntao Li; Kehai Chen; Min Zhang

arXiv:2408.16967·cs.CL·September 2, 2024

TL;DR

MemLong is a memory-augmented retrieval method for long-text modeling. It extends the usable context length from 4k to 80k tokens on a single GPU and is reported to outperform existing state-of-the-art long-context models.

Contribution

The paper presents a retrieval-augmented approach that combines a non-differentiable "ret-mem" module with a partially trainable decoder-only language model, using an external retriever to bring relevant historical chunks back into the model when contexts are very long.

Findings

01 Outperforms existing state-of-the-art long-context models

02 Extends context length from 4k to 80k tokens on a single GPU

03 Demonstrates superior performance on multiple long-context language modeling benchmarks
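
To give a sense of why a jump from 4k to 80k tokens is demanding, the following back-of-the-envelope sketch estimates the key-value cache size of a standard decoder that keeps every token in its cache. The model shape (32 layers, 32 key-value heads, head dimension 128, 16-bit values) is a hypothetical configuration chosen for illustration and is not taken from the paper; "80k" is assumed here to mean 80 × 1024 tokens.

```python
def kv_cache_bytes(seq_len, n_layers=32, n_kv_heads=32, head_dim=128, bytes_per_value=2):
    """Bytes needed to cache keys and values for one sequence.

    All default sizes are hypothetical, not the configuration used by MemLong.
    """
    # Factor 2: each layer stores one tensor of keys and one of values.
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_value


for tokens in (4 * 1024, 80 * 1024):
    gib = kv_cache_bytes(tokens) / 2**30
    print(f"{tokens:>6} tokens -> {gib:.1f} GiB")
# prints:
#   4096 tokens -> 2.0 GiB
#  81920 tokens -> 40.0 GiB
```

Under these assumptions the cache grows linearly, from about 2 GiB to about 40 GiB, which is the kind of growth a retrieval-based memory is meant to avoid holding in the attention window all at once.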

Abstract

Recent advancements in Large Language Models (LLMs) have yielded remarkable success across diverse fields. However, handling long contexts remains a significant challenge for LLMs due to the quadratic time and space complexity of attention mechanisms and the growing memory consumption of the key-value cache during generation. This work introduces MemLong: Memory-Augmented Retrieval for Long Text Generation, a method designed to enhance the capabilities of long-context language modeling by utilizing an external retriever for historical information retrieval. MemLong combines a non-differentiable "ret-mem" module with a partially trainable decoder-only language model and introduces a fine-grained, controllable retrieval attention mechanism that leverages semantic-level relevant chunks. Comprehensive evaluations on multiple long-context language modeling benchmarks demonstrate that…
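
The idea of retrieving semantically relevant chunks from an external memory can be illustrated with a generic sketch. This is not the paper's "ret-mem" implementation: the `ChunkMemory` class, its methods, the cosine-similarity scoring, and the toy 3-dimensional embeddings are all assumptions made for illustration. The payload stands in for whatever a model would store per chunk.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ChunkMemory:
    """Hypothetical external memory: one embedding and one payload per chunk."""

    def __init__(self):
        self.embeddings = []
        self.payloads = []

    def add(self, embedding, payload):
        # Nothing here is trained; entries are stored as-is (non-differentiable).
        self.embeddings.append(list(embedding))
        self.payloads.append(payload)

    def retrieve(self, query, k=2):
        # Rank stored chunks by similarity to the query and return the top k payloads.
        order = sorted(
            range(len(self.embeddings)),
            key=lambda i: cosine(query, self.embeddings[i]),
            reverse=True,
        )
        return [self.payloads[i] for i in order[:k]]


memory = ChunkMemory()
memory.add([1.0, 0.0, 0.0], "chunk A")
memory.add([0.0, 1.0, 0.0], "chunk B")
memory.add([0.9, 0.1, 0.0], "chunk C")

top = memory.retrieve([1.0, 0.0, 0.0], k=2)
print(top)  # ['chunk A', 'chunk C']
```

In a real system the embeddings would come from a retriever model and the lookup would typically use an approximate nearest-neighbor index rather than a full sort; both are left out here to keep the sketch self-contained.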

Code & Models

Repositories

bui1dmysea/memlong (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Semantic Web and Ontologies

Methods: Softmax · Attention Is All You Need