Forgetting Curve: A Reliable Method for Evaluating Memorization Capability for Long-context Models

Xinyu Liu; Runsong Zhao; Pengcheng Huang; Chunyang Xiao; Bei Li; Jingang Wang; Tong Xiao; Jingbo Zhu

arXiv:2410.04727·cs.CL·October 8, 2024

TL;DR

This paper introduces the forgetting curve, a new robust method for evaluating the memorization capacity of long-context models, addressing limitations of previous metrics and applicable across various architectures.

Contribution

The paper proposes the forgetting curve, an evaluation method that is robust to the test corpus and experimental settings, does not rely on prompts, and applies to models of any size, giving a clearer picture of how well long-context models memorize.

Findings

01

Forgetting-curve measurements provide empirical evidence that transformer context-extension techniques are effective.

02

RNN/SSM-based models have a limited effective memorization length.

03

The forgetting curve differs from existing benchmarks.

Abstract

Numerous recent works aim to extend the effective context length of language models, and various methods, tasks, and benchmarks exist to measure a model's effective memorization length. However, through thorough investigation, we find limitations in existing evaluations of a model's memorization capability. We provide an extensive survey of these limitations and propose a new method, the forgetting curve, to measure the memorization capability of long-context models. We show that the forgetting curve is robust to the tested corpus and the experimental settings, does not rely on prompts, and can be applied to models of any size. We apply the forgetting curve to a wide variety of models, covering both transformer and RNN/SSM-based architectures. Our measurements provide empirical evidence for the effectiveness of transformer extension techniques while…

Code & Models

Repositories

1azybug/forgettingcurve (PyTorch, official)

Videos

Forgetting Curve: A Reliable Method for Evaluating Memorization Capability for Long-Context Models (Underline)

Taxonomy

Topics: Semantic Web and Ontologies