Needle in the Haystack for Memory Based Large Language Models

Elliot Nelson; Georgios Kollias; Payel Das; Subhajit Chaudhury; Soham; Dan

arXiv:2407.01437 · cs.CL · July 15, 2024 · 5 cites

Open Access

TL;DR

This paper demonstrates that integrating a dynamic external memory with a language model enhances long-context fact retrieval, outperforming larger models without additional training or modified attention mechanisms.

Contribution

The study tests Larimar, a recently proposed language model architecture with an external associative memory, and shows that the memory enables long-context recall at test time without additional training or a larger model.

Findings

01. Larimar effectively handles longer contexts than seen during training.

02. External memory improves fact retrieval accuracy in LLMs.

03. Larimar outperforms larger models on long-context tasks.

Abstract

Current large language models (LLMs) often perform poorly on simple fact retrieval tasks. Here we investigate whether coupling a dynamically adaptable external memory to an LLM can alleviate this problem. For this purpose, we test Larimar, a recently proposed language model architecture which uses an external associative memory, on long-context recall tasks including passkey and needle-in-the-haystack tests. We demonstrate that the external memory of Larimar, which allows fast write and read of an episode of text samples, can be used at test time to handle contexts much longer than those seen during training. We further show that the latent readouts from the memory (to which long contexts are written) control the decoder towards generating correct outputs, with the memory stored off the GPU. Compared to existing transformer-based LLM architectures for long-context recall tasks that use…
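To make the passkey setup mentioned in the abstract concrete, here is a minimal sketch of how such a test context might be assembled and split into episodes of text samples before being written to a memory. The function names, the filler sentences, the needle wording, and the episode size are all assumptions made for illustration; they are not taken from the paper, and no Larimar code is involved.

```python
def build_passkey_haystack(passkey: str, num_filler_sentences: int, depth: float) -> list[str]:
    """Return filler sentences with one passkey sentence placed at a relative depth in [0, 1]."""
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be in [0, 1]")
    # Hypothetical filler text; any repetitive distractor sentences would do.
    filler = ["The grass is green.", "The sky is blue.", "The sun is yellow."]
    sentences = [filler[i % len(filler)] for i in range(num_filler_sentences)]
    # Hypothetical needle wording.
    needle = f"The pass key is {passkey}. Remember it."
    position = round(depth * num_filler_sentences)
    sentences.insert(position, needle)
    return sentences


def split_into_episodes(sentences: list[str], episode_size: int) -> list[list[str]]:
    """Group consecutive sentences into fixed-size episodes (the last one may be shorter)."""
    if episode_size <= 0:
        raise ValueError("episode_size must be positive")
    return [sentences[i:i + episode_size] for i in range(0, len(sentences), episode_size)]


# Build a small haystack with the passkey halfway through, then chunk it.
haystack = build_passkey_haystack("71432", num_filler_sentences=20, depth=0.5)
episodes = split_into_episodes(haystack, episode_size=4)

# Locate the episode that contains the passkey sentence.
needle_episode = next(
    index for index, episode in enumerate(episodes)
    if any("71432" in sentence for sentence in episode)
)
```

In a real evaluation, each episode would be encoded and written to the external memory, and the model would then be queried for the passkey; varying `depth` and `num_filler_sentences` probes recall at different positions and context lengths.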

Taxonomy

Topics: Topic Modeling

Methods: Softmax · Attention Is All You Need