TL;DR
This paper investigates where and how memorisation occurs in neural language models across multiple NLP tasks. It finds that memorisation is gradual and task-dependent, which challenges the idea that memorisation is confined to deeper layers.
Contribution
The paper extends the analysis of memorisation localisation to 12 natural language classification tasks and applies four localisation techniques, showing that memorisation is gradual and task-dependent and refining existing hypotheses.
Findings
Memorisation occurs gradually across layers.
Memorisation is highly task-dependent.
The results challenge the idea that memorisation is confined to deeper layers.
Abstract
Memorisation is a natural part of learning from real-world data: neural models pick up on atypical input-output combinations and store those training examples in their parameter space. That this happens is well-known, but how and where are questions that remain largely unanswered. Given a multi-layered neural model, where does memorisation occur in the millions of parameters? Related work reports conflicting findings: a dominant hypothesis based on image classification is that lower layers learn generalisable features and that deeper layers specialise and memorise. Work from NLP suggests this does not apply to language models, but has been mainly focused on memorisation of facts. We expand the scope of the localisation question to 12 natural language classification tasks and apply 4 memorisation localisation techniques. Our results indicate that memorisation is a gradual process rather…