Recognition, recall, and retention of few-shot memories in large language models
A. Emin Orhan

TL;DR
This paper investigates what large language models remember about examples seen only a few times during training. Recognition emerges after a single exposure, recall needs a few more, and memory for the original examples decays quickly but not completely under continued training.
Contribution
It provides empirical insights into the recognition, recall, and retention capabilities of large language models for few-shot memories, highlighting their rapid learning and memory dynamics.
Findings
Single exposure often suffices for near-perfect recognition.
Recall requires more exposures but can be achieved in as few as 3.
Memory for the original examples declines quickly, but some of it persists after many updates.
Abstract
The training of modern large language models (LLMs) takes place in a regime where most training examples are seen only a few times by the model during the course of training. What does a model remember about such examples seen only a few times during training and how long does that memory persist in the face of continuous training with new examples? Here, we investigate these questions through simple recognition, recall, and retention experiments with LLMs. In recognition experiments, we ask if the model can distinguish the seen example from a novel example; in recall experiments, we ask if the model can correctly recall the seen example when cued by a part of it; and in retention experiments, we periodically probe the model's memory for the original examples as the model is trained continuously with new examples. We find that a single exposure is generally sufficient for a model to…
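The three probes described in the abstract can be sketched on a toy model. The following is a minimal illustration only, not the paper's setup: the character-bigram model `BigramLM`, the helper names `recognizes` and `recalls`, and the example strings are all hypothetical stand-ins for an LLM and its training data. Recognition is approximated here by comparing the likelihood of the seen example against a novel one, recall by greedy completion from a cue, and retention by repeating the probes after further training.

```python
import math
from collections import defaultdict


class BigramLM:
    """Toy character-bigram model; a hypothetical stand-in for an LLM."""

    def __init__(self, alphabet):
        self.alphabet = sorted(set(alphabet))
        self.counts = defaultdict(lambda: defaultdict(int))

    def train(self, text):
        # One "exposure": add the bigram counts of `text`.
        for prev, nxt in zip(text, text[1:]):
            self.counts[prev][nxt] += 1

    def log_prob(self, text):
        # Add-one smoothed log-likelihood of `text` under the model.
        total = 0.0
        for prev, nxt in zip(text, text[1:]):
            row = self.counts[prev]
            denom = sum(row.values()) + len(self.alphabet)
            total += math.log((row[nxt] + 1) / denom)
        return total

    def complete(self, cue, length):
        # Greedy continuation of `cue` by `length` characters.
        out = cue
        for _ in range(length):
            row = self.counts[out[-1]]
            out += max(self.alphabet, key=lambda c: row[c])
        return out


def recognizes(model, seen, novel):
    # Recognition probe: is the seen example more likely than a novel one?
    return model.log_prob(seen) > model.log_prob(novel)


def recalls(model, example, cue_len):
    # Recall probe: does the model reproduce the example from its first part?
    cue = example[:cue_len]
    return model.complete(cue, len(example) - cue_len) == example


seen, novel = "abcdef", "fedcba"       # illustrative strings (assumption)
model = BigramLM("abcdef")
model.train(seen)                      # a single exposure
recognized = recognizes(model, seen, novel)
recalled = recalls(model, seen, cue_len=2)

model.train("bfbfbf")                  # retention: keep training on new data
recognized_later = recognizes(model, seen, novel)
recalled_later = recalls(model, seen, cue_len=2)
print(recognized, recalled, recognized_later, recalled_later)
```

In this toy, the later training on an interfering string breaks exact recall while the likelihood-based recognition probe still favors the seen example. That is a property of the sketch and should not be read as a reproduction of the paper's measurements.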
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Domain Adaptation and Few-Shot Learning
Methods: FLIP
