Mass-Editing Memory in a Transformer
Kevin Meng, Arnab Sen Sharma, Alex Andonian, Yonatan Belinkov, David Bau

TL;DR
This paper introduces MEMIT, a method for directly updating a large language model with thousands of new memories at once, exceeding prior work, which is mostly limited to single associations, by orders of magnitude in scale.
Contribution
The paper presents MEMIT, a technique for directly updating a large language model with many memories at once, addressing the limitation of prior work to single-association updates.
Findings
Scales to thousands of associations in GPT-J (6B) and GPT-NeoX (20B).
Exceeds prior work by orders of magnitude in the number of associations updated.
Shows experimentally that direct memory updates remain effective at this scale.
Abstract
Recent work has shown exciting promise in updating large language models with new memories, so as to replace obsolete information or add specialized knowledge. However, this line of work is predominantly limited to updating single associations. We develop MEMIT, a method for directly updating a language model with many memories, demonstrating experimentally that it can scale up to thousands of associations for GPT-J (6B) and GPT-NeoX (20B), exceeding prior work by orders of magnitude. Our code and data are at https://memit.baulab.info.
Taxonomy
Topics: Topic Modeling · Advanced Data Storage Technologies · Scientific Computing and Data Management
Methods: GPT-NeoX
