SirLLM: Streaming Infinite Retentive LLM

Yao Yao; Zuchao Li; Hai Zhao

arXiv:2405.12528 · cs.CL · May 22, 2024

TL;DR

SirLLM introduces a novel streaming LLM approach that maintains long-term memory during infinite-length dialogues without fine-tuning, using token entropy and memory decay to filter key information.

Contribution

The paper presents SirLLM, a new method enabling LLMs to retain long-term memory in streaming inputs through a token entropy-based filtering and decay mechanism, without additional training.
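
The filtering idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the class `EntropyMemory`, the helper `token_score`, the use of surprisal (`-log p`) as a stand-in for the paper's token entropy score, and the per-step multiplicative `decay` factor are all assumptions made for illustration. The sketch keeps a fixed-size memory of tokens, decays the scores of older tokens each time a new one arrives, and evicts the lowest-scoring token when the memory is full.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class CachedToken:
    text: str
    score: float  # importance score, reduced by decay over time


def token_score(prob: float) -> float:
    """Surprisal -log(p); an assumed stand-in for the paper's token entropy."""
    return -math.log(prob)


class EntropyMemory:
    """Hypothetical fixed-capacity memory that keeps high-score tokens."""

    def __init__(self, capacity: int, decay: float = 0.9) -> None:
        self.capacity = capacity
        self.decay = decay  # assumed multiplicative decay per added token
        self.tokens: List[CachedToken] = []

    def add(self, text: str, prob: float) -> None:
        # Decay the scores of everything already in memory.
        for tok in self.tokens:
            tok.score *= self.decay
        self.tokens.append(CachedToken(text, token_score(prob)))
        # Over capacity: evict the lowest-scoring token, preserving order.
        if len(self.tokens) > self.capacity:
            drop = min(range(len(self.tokens)),
                       key=lambda i: self.tokens[i].score)
            del self.tokens[drop]

    def contents(self) -> List[str]:
        return [tok.text for tok in self.tokens]


memory = EntropyMemory(capacity=2, decay=0.5)
memory.add("the", prob=0.9)     # predictable token, low score
memory.add("Paris", prob=0.01)  # surprising token, high score
memory.add("is", prob=0.8)      # forces an eviction
print(memory.contents())
```

In this toy run the predictable token "the" is evicted first, while the surprising token "Paris" survives even after its score has decayed. A real system would apply the same selection to key-value cache entries rather than to strings.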

Findings

01

Significant improvements in long-term memory retention across various tasks.

02

Effective handling of infinite-length dialogues without fine-tuning.

03

Robust performance demonstrated on multiple datasets.

Abstract

As Large Language Models (LLMs) become increasingly prevalent in various domains, their ability to process inputs of any length and maintain a degree of memory becomes essential. However, the one-off input of overly long texts is limited, as studies have shown that when input lengths exceed the LLMs' pre-trained text length, there is a dramatic decline in text generation capabilities. Moreover, simply extending the length of pre-training texts is impractical due to the difficulty in obtaining long text data and the substantial memory consumption costs this would entail for LLMs. Recent efforts have employed streaming inputs to alleviate the pressure of excessively long text inputs, but this approach can significantly impair the model's long-term memory capabilities. Motivated by this challenge, we introduce Streaming Infinite Retentive LLM (SirLLM), which allows LLMs to maintain…

Code & Models

Repositories

zoeyyao27/sirllm (PyTorch, official)

Taxonomy

Topics: Image Processing and 3D Reconstruction · Natural Language Processing Techniques · Handwritten Text Recognition Techniques