Locating and Editing Factual Associations in GPT

Kevin Meng; David Bau; Alex Andonian; Yonatan Belinkov

arXiv:2202.05262 · cs.CL · January 16, 2023 · 174 cites

TL;DR

This paper investigates how factual knowledge is stored in GPT models, traces factual recall to computations in middle-layer feed-forward modules, and introduces ROME, a method for directly editing those associations.

Contribution

The paper finds evidence that factual associations correspond to localized, directly editable computations in middle-layer feed-forward modules, and proposes Rank-One Model Editing (ROME) to modify them.

Findings

01. ROME effectively updates factual associations in GPT models.

02. Mid-layer feed-forward modules are key to storing factual knowledge.

03. ROME maintains specificity and generalization on counterfactual assertions.

Abstract

We analyze the storage and recall of factual associations in autoregressive transformer language models, finding evidence that these associations correspond to localized, directly-editable computations. We first develop a causal intervention for identifying neuron activations that are decisive in a model's factual predictions. This reveals a distinct set of steps in middle-layer feed-forward modules that mediate factual predictions while processing subject tokens. To test our hypothesis that these computations correspond to factual association recall, we modify feed-forward weights to update specific factual associations using Rank-One Model Editing (ROME). We find that ROME is effective on a standard zero-shot relation extraction (zsRE) model-editing task, comparable to existing methods. To perform a more sensitive evaluation, we also evaluate ROME on a new dataset of counterfactual…
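The core idea behind a rank-one weight edit can be sketched in a few lines. The example below is a simplified illustration, not the paper's implementation: it treats a feed-forward weight matrix as a linear key-to-value map and applies the minimal rank-one change that makes a chosen key produce a chosen value. It assumes keys are uncorrelated (identity covariance), whereas the paper's update additionally accounts for the statistics of other keys. The names `matvec`, `rank_one_edit`, `W`, `k`, `v`, and `k_other` are hypothetical and chosen only for this sketch.

```python
def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]


def rank_one_edit(W, k, v):
    """Return W' = W + (v - W k) k^T / (k^T k), so that W' k equals v.

    Simplifying assumption (not from the paper): keys are treated as
    uncorrelated, i.e. the key covariance is the identity matrix.
    """
    norm_sq = sum(k_j * k_j for k_j in k)
    if norm_sq == 0:
        raise ValueError("key vector must be non-zero")
    # Residual between the desired value and what W currently returns for k.
    residual = [v_i - y_i for v_i, y_i in zip(v, matvec(W, k))]
    # Add the outer product residual * k^T / ||k||^2 to W (a rank-one matrix).
    return [
        [w_ij + r_i * k_j / norm_sq for w_ij, k_j in zip(row, k)]
        for row, r_i in zip(W, residual)
    ]


# Toy 2x3 "feed-forward" weight matrix with hypothetical key and value vectors.
W = [[1.0, 0.0, 2.0],
     [0.0, 1.0, 0.0]]
k = [1.0, 2.0, 0.0]         # stands in for a subject representation
v = [5.0, -1.0]             # stands in for the new association to store
W_new = rank_one_edit(W, k, v)

k_other = [-2.0, 1.0, 0.0]  # orthogonal to k, so its output is unchanged
```

In this toy setting `matvec(W_new, k)` matches `v` up to floating-point error, while `matvec(W_new, k_other)` matches `matvec(W, k_other)`. That loosely mirrors the two properties the listing highlights: the edited association changes, and unrelated inputs are left alone.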

Code & Models

Videos

ROME: Locating and Editing Factual Associations in GPT (Paper Explained & Author Interview) · YouTube

Locating and Editing Factual Associations in GPT · SlidesLive

Taxonomy

Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Text Readability and Simplification

Methods: Rank-One Model Editing