Mass-Editing Memory in a Transformer

Kevin Meng; Arnab Sen Sharma; Alex Andonian; Yonatan Belinkov; David Bau

arXiv:2210.07229 · cs.CL · August 3, 2023 · 52 cites

TL;DR

This paper introduces MEMIT, a method for directly updating a large language model with thousands of new memories at once, exceeding the number of associations handled by prior approaches by orders of magnitude.

Contribution

The paper presents MEMIT, a novel technique enabling large language models to be updated with numerous memories at once, addressing limitations of prior single-association updates.

Findings

01. Scales to thousands of associations in GPT-J (6B) and GPT-NeoX (20B).

02. Exceeds the number of associations handled by prior work by orders of magnitude.

03. Demonstrates experimentally that many memories can be written directly into large models.

Abstract

Recent work has shown exciting promise in updating large language models with new memories, so as to replace obsolete information or add specialized knowledge. However, this line of work is predominantly limited to updating single associations. We develop MEMIT, a method for directly updating a language model with many memories, demonstrating experimentally that it can scale up to thousands of associations for GPT-J (6B) and GPT-NeoX (20B), exceeding prior work by orders of magnitude. Our code and data are at https://memit.baulab.info.

Code & Models

Repositories

Datasets

Polyglot-or-Not/Fact-Completion
dataset · 895 downloads

Videos

Mass-Editing Memory in a Transformer · SlidesLive

Taxonomy

Topics: Topic Modeling · Advanced Data Storage Technologies · Scientific Computing and Data Management

Methods: GPT-NeoX