Are Pretrained Language Models Symbolic Reasoners Over Knowledge?

Nora Kassner; Benno Krojer; Hinrich Schütze

arXiv:2006.10413·cs.CL·October 13, 2020

TL;DR

This paper investigates how pretrained language models (PLMs) acquire factual knowledge through two mechanisms, reasoning and memorization. It finds that PLMs apply some symbolic reasoning rules correctly but struggle with others, and that successful memorization depends on schema conformity and frequency.

Contribution

Using synthetic data, the paper presents what the authors describe as the first study of the causal relation between facts present in training and facts learned by a PLM.

Findings

01 PLMs can apply some symbolic reasoning rules correctly

02 PLMs struggle with two-hop reasoning

03 Memorization depends on schema conformity and frequency

Abstract

How can pretrained language models (PLMs) learn factual knowledge from the training set? We investigate the two most important mechanisms: reasoning and memorization. Prior work has attempted to quantify the number of facts PLMs learn, but we present, using synthetic data, the first study that investigates the causal relation between facts present in training and facts learned by the PLM. For reasoning, we show that PLMs seem to learn to apply some symbolic reasoning rules correctly but struggle with others, including two-hop reasoning. Further analysis suggests that even the application of learned reasoning rules is flawed. For memorization, we identify schema conformity (facts systematically supported by other facts) and frequency as key factors for its success.
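To make the synthetic-data setup in the abstract more concrete, the following is a minimal sketch of how facts governed by a symbolic rule could be generated and split so that the held-out facts are only recoverable by applying the rule. It is not the authors' code. The entity names, the relation names (`r_sym`, `r1`, `r2`, `r3`), the 50/50 split, and both function names are assumptions made for illustration.

```python
import random


def make_symmetry_facts(num_pairs, seed=0):
    """Build (train, test) triples for a hypothetical symmetric relation.

    Rule illustrated: (a, r_sym, b) implies (b, r_sym, a).
    Every forward fact goes into train. For half of the pairs the reverse
    fact is also in train (so the rule is demonstrated); for the other
    half the reverse fact is held out and can only be inferred.
    """
    rng = random.Random(seed)
    entities = [f"e{i}" for i in range(2 * num_pairs)]
    rng.shuffle(entities)
    train, test = [], []
    for i in range(num_pairs):
        a, b = entities[2 * i], entities[2 * i + 1]
        train.append((a, "r_sym", b))
        reverse = (b, "r_sym", a)
        if i % 2 == 0:
            train.append(reverse)  # rule instance shown during training
        else:
            test.append(reverse)   # must be inferred via the rule
    return train, test


def make_two_hop_facts(num_chains):
    """Build (train, test) triples for a hypothetical two-hop rule.

    Rule illustrated: (a, r1, b) and (b, r2, c) imply (a, r3, c).
    Both hops are always in train; the composed fact is held out
    for every second chain.
    """
    train, test = [], []
    for i in range(num_chains):
        a, b, c = f"a{i}", f"b{i}", f"c{i}"
        train.append((a, "r1", b))
        train.append((b, "r2", c))
        composed = (a, "r3", c)
        if i % 2 == 0:
            train.append(composed)
        else:
            test.append(composed)
    return train, test


if __name__ == "__main__":
    sym_train, sym_test = make_symmetry_facts(num_pairs=4)
    hop_train, hop_test = make_two_hop_facts(num_chains=4)
    print("symmetry held-out facts:", sym_test)
    print("two-hop held-out facts:", hop_test)
```

Under these assumptions, a model trained on the `train` triples and queried on the `test` triples can only answer correctly by applying the rule, since no held-out fact appears verbatim in training. That is the sense in which a synthetic corpus allows a causal reading of what was learned from which facts.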

Code & Models

Repositories

BennoKrojer/reasoning-over-facts (Official)
