SirLLM: Streaming Infinite Retentive LLM
Yao Yao, Zuchao Li, Hai Zhao

TL;DR
SirLLM introduces a novel streaming LLM approach that maintains long-term memory during infinite-length dialogues without fine-tuning, using token entropy and memory decay to filter key information.
Contribution
The paper presents SirLLM, a new method enabling LLMs to retain long-term memory in streaming inputs through a token entropy-based filtering and decay mechanism, without additional training.
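The filtering-and-decay idea can be illustrated with a minimal sketch. This is not the authors' implementation: the `MemoryCache` class, its `capacity` and `decay` parameters, and the use of per-token surprisal (`-log p`) as the "token entropy" score are all assumptions made for illustration, and a real system would filter key-value cache entries rather than token strings.

```python
import math
from dataclasses import dataclass, field


@dataclass
class MemoryCache:
    """Hypothetical fixed-size memory that keeps the most informative tokens."""
    capacity: int                 # assumed: maximum number of tokens retained
    decay: float = 0.9            # assumed: factor applied to old scores on each update
    entries: list = field(default_factory=list)  # (position, token, score) tuples

    def update(self, tokens, probs, start_pos):
        """Add a new chunk of tokens with their model probabilities (each > 0)."""
        # Decay the scores of tokens already in memory so stale content fades.
        self.entries = [(pos, tok, score * self.decay)
                        for pos, tok, score in self.entries]
        # Score each new token by its surprisal, -log p (an assumed proxy
        # for the token entropy mentioned in the summary).
        for offset, (tok, prob) in enumerate(zip(tokens, probs)):
            self.entries.append((start_pos + offset, tok, -math.log(prob)))
        # Keep only the highest-scoring tokens, then restore positional order.
        kept = sorted(self.entries, key=lambda e: e[2], reverse=True)[:self.capacity]
        self.entries = sorted(kept, key=lambda e: e[0])

    def tokens(self):
        """Return the retained tokens in their original order."""
        return [tok for _, tok, _ in self.entries]


cache = MemoryCache(capacity=2, decay=0.5)
cache.update(["the", "Paris"], [0.9, 0.1], start_pos=0)
cache.update(["is", "rainy"], [0.8, 0.2], start_pos=2)
print(cache.tokens())
```

In this toy run, the predictable tokens ("the", "is") are evicted first, while the decay factor lets a newer surprising token outrank an older one with a similar original score.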
Findings
SirLLM improves long-term memory retention across the evaluated tasks.
It handles infinite-length dialogues without fine-tuning.
Its performance holds up across multiple datasets.
Abstract
As Large Language Models (LLMs) become increasingly prevalent in various domains, their ability to process inputs of any length and maintain a degree of memory becomes essential. However, the one-off input of overly long texts is limited, as studies have shown that when input lengths exceed the LLMs' pre-trained text length, there is a dramatic decline in text generation capabilities. Moreover, simply extending the length of pre-training texts is impractical due to the difficulty in obtaining long text data and the substantial memory consumption costs this would entail for LLMs. Recent efforts have employed streaming inputs to alleviate the pressure of excessively long text inputs, but this approach can significantly impair the model's long-term memory capabilities. Motivated by this challenge, we introduce Streaming Infinite Retentive LLM (SirLLM), which allows LLMs to maintain…
Taxonomy
Topics: Image Processing and 3D Reconstruction · Natural Language Processing Techniques · Handwritten Text Recognition Techniques
