Co-occurrence is not Factual Association in Language Models

Xiao Zhang; Miao Li; Ji Wu

arXiv:2409.14057 · cs.CL · June 17, 2025

TL;DR

This paper shows that, when finetuned on limited text, language models are biased toward learning word co-occurrence statistics rather than true factual associations, and it proposes two strategies to improve factual knowledge learning and generalization.

Contribution

It identifies where each form of knowledge is encoded (co-occurrence statistics in the middle layers of the transformer, true factual associations in the lower layers) and introduces methods that favor learning factual associations over co-occurrence statistics.

Findings

01. Training on implicit factual associations improves generalization.

02. Forgetting co-occurrence statistics enhances factual learning.

03. The proposed strategies improve reasoning performance on synthetic and real data.

Abstract

Pretrained language models can encode a large amount of knowledge and utilize it for various reasoning tasks, yet they can still struggle to learn novel factual knowledge effectively from finetuning on limited textual demonstrations. In this work, we show that the reason for this deficiency is that language models are biased to learn word co-occurrence statistics instead of true factual associations. We identify the differences between two forms of knowledge representation in language models: knowledge in the form of co-occurrence statistics is encoded in the middle layers of the transformer model and does not generalize well to reasoning scenarios beyond simple question answering, while true factual associations are encoded in the lower layers and can be freely utilized in various reasoning tasks. Based on these observations, we propose two strategies to improve the learning of factual…
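The distinction the abstract draws can be illustrated with a toy sketch. This is not the paper's method or data: the helper `cooccurrence_counts` and the two example sentences are hypothetical, chosen only to show that a pure co-occurrence statistic cannot tell apart two sentences that state opposite facts.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(sentences):
    """Count how often each unordered pair of tokens shares a sentence.

    Hypothetical helper for illustration; tokens are lowercased and
    split on whitespace, and each pair is stored in sorted order.
    """
    counts = Counter()
    for sentence in sentences:
        tokens = sorted(set(sentence.lower().split()))
        for pair in combinations(tokens, 2):
            counts[pair] += 1
    return counts


# Two toy corpora (assumed examples) asserting opposite relations.
corpus_a = ["Alice manages Bob"]
corpus_b = ["Bob manages Alice"]

pair = ("alice", "bob")
count_a = cooccurrence_counts(corpus_a)[pair]
count_b = cooccurrence_counts(corpus_b)[pair]

# The statistic is identical even though the stated facts differ.
statistics_match = count_a == count_b
```

Here `statistics_match` is true because the pair count ignores who manages whom. A model that only captured such statistics could answer a simple completion correctly yet still lack the directed relation needed for further reasoning.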

Code & Models

Repositories

amounts-tidings/fact_learning (PyTorch, official)

Taxonomy

Topics: Natural Language Processing Techniques