Extractive Structures Learned in Pretraining Enable Generalization on Finetuned Facts

Jiahai Feng; Stuart Russell; Jacob Steinhardt

arXiv:2412.04614 · cs.LG · May 23, 2025

TL;DR

This paper introduces extractive structures, a framework for explaining how pretrained language models generalize to implications of facts they are finetuned on, and identifies mechanisms of fact storage and extraction that operate at different layers.

Contribution

The paper proposes the concept of extractive structures to explain how components in language models coordinate to enable fact-based generalization, supported by empirical evidence across multiple models.

Findings

01. Extractive structures are learned during pretraining when facts are encountered before their implications.

02. Transfer of extractive structures allows counterfactual reasoning about facts.

03. Fact learning occurs at both early and late layers, enabling different types of generalization.

Abstract

Pretrained language models (LMs) can generalize to implications of facts that they are finetuned on. For example, if finetuned on "John Doe lives in Tokyo," LMs can correctly answer "What language do the people in John Doe's city speak?" with "Japanese". However, little is known about the mechanisms that enable this generalization or how they are learned during pretraining. We introduce extractive structures as a framework for describing how components in LMs (e.g., MLPs or attention heads) coordinate to enable this generalization. The structures consist of informative components that store training facts as weight changes, and upstream and downstream extractive components that query and process the stored information to produce the correct implication. We hypothesize that extractive structures are learned during pretraining when encountering implications of previously known facts.…
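The division of labor described in the abstract can be pictured with a deliberately simplified toy. The sketch below is only an analogy, not the authors' code and not how a real LM computes: plain dictionaries stand in for model components, and every name (`ToyModel`, `CITY_TO_LANGUAGE`, `finetune`, and so on) is hypothetical. It assumes the city-to-language knowledge already exists from "pretraining", and that "finetuning" only writes the new fact into the informative component.

```python
# Toy analogy only: dictionaries stand in for LM components.
# All names here are hypothetical and chosen for illustration.

# Knowledge assumed to exist already from "pretraining";
# the downstream extractive step uses it to process a recalled fact.
CITY_TO_LANGUAGE = {"Tokyo": "Japanese", "Paris": "French"}


class ToyModel:
    def __init__(self):
        # Informative component: holds finetuned facts
        # (a stand-in for facts stored as weight changes).
        self.residence = {}

    def finetune(self, person, city):
        # "Finetuning" on the fact "<person> lives in <city>".
        self.residence[person] = city

    def recall_city(self, person):
        # Upstream extractive step: query the informative component.
        return self.residence.get(person)

    def language_of_city_of(self, person):
        # Downstream extractive step: turn the recalled fact
        # into the implication (the language spoken in that city).
        city = self.recall_city(person)
        return CITY_TO_LANGUAGE.get(city)


model = ToyModel()
before = model.language_of_city_of("John Doe")  # fact not yet stored
model.finetune("John Doe", "Tokyo")
after = model.language_of_city_of("John Doe")   # implication now available
print(before, after)
```

In this toy, the implication becomes answerable only because the extraction steps were already in place before the fact was written, which mirrors the abstract's hypothesis about what pretraining has to supply.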

Code & Models

Repositories

jiahai-feng/extractive-structures
PyTorch · Official

Videos

Extractive Structures Learned in Pretraining Enable Generalization on Finetuned Facts · SlidesLive

Taxonomy

Topics: Neural Networks and Applications · Fuzzy Logic and Control Systems

Methods: Softmax · Attention Is All You Need · LLaMA