Forgetting Curve: A Reliable Method for Evaluating Memorization Capability for Long-context Models
Xinyu Liu, Runsong Zhao, Pengcheng Huang, Chunyang Xiao, Bei Li, Jingang Wang, Tong Xiao, Jingbo Zhu

TL;DR
This paper introduces the forgetting curve, a new robust method for evaluating the memorization capacity of long-context models, addressing limitations of previous metrics and applicable across various architectures.
Contribution
It proposes the forgetting curve as a novel evaluation method that is robust and model-agnostic, and that improves understanding of the memorization capabilities of long-context models.
Findings
The forgetting curve provides empirical evidence that transformer context-extension techniques are effective.
RNN/SSM models have a limited effective memorization length.
The forgetting curve differs from existing benchmarks.
Abstract
Numerous recent works aim to extend the effective context length of language models, and various methods, tasks and benchmarks exist to measure a model's effective memorization length. However, through thorough investigation, we find limitations in existing evaluations of a model's memorization capability. We provide an extensive survey of these limitations in this work and propose a new method, called the forgetting curve, to measure the memorization capability of long-context models. We show that the forgetting curve has the advantage of being robust to the tested corpus and the experimental settings, of not relying on prompts, and of being applicable to any model size. We apply our forgetting curve to a large variety of models involving both transformer and RNN/SSM based architectures. Our measurement provides empirical evidence for the effectiveness of transformer extension techniques while…
Taxonomy
Topics: Semantic Web and Ontologies
