Implicit Representations of Meaning in Neural Language Models

Belinda Z. Li; Maxwell Nye; Jacob Andreas

arXiv:2106.00737·cs.CL·June 3, 2021

TL;DR

This paper investigates whether neural language models such as BART and T5 encode dynamic, entity-based representations of meaning that support reasoning about the world, rather than relying only on surface-level word co-occurrence statistics.

Contribution

The paper shows that pretrained BART and T5 models develop implicit, manipulable representations of entities and situations, functionally similar to linguistic models of dynamic semantics, and that these representations can be learned with only text as training data.

Findings

01 Contextual word representations support a linear readout of each entity's current properties and relations.

02 Manipulating these representations has predictable effects on language generation.

03 Prediction is supported, at least in part, by dynamic, entity-based representations of meaning that evolve throughout a discourse.

Abstract

Does the effectiveness of neural language models derive entirely from accurate modeling of surface word co-occurrence statistics, or do these models represent and reason about the world they describe? In BART and T5 transformer language models, we identify contextual word representations that function as models of entities and situations as they evolve throughout a discourse. These neural representations have functional similarities to linguistic models of dynamic semantics: they support a linear readout of each entity's current properties and relations, and can be manipulated with predictable effects on language generation. Our results indicate that prediction in pretrained neural language models is supported, at least in part, by dynamic representations of meaning and implicit simulation of entity state, and that this behavior can be learned with only text as training data. Code and…
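The "linear readout" mentioned in the abstract can be illustrated with a linear probe: a linear classifier trained to predict an entity's property from a fixed representation vector. The block below is only a minimal sketch under stated assumptions, not the authors' implementation. It uses synthetic vectors in place of real BART or T5 activations, and every name in it (`make_synthetic_states`, `train_linear_probe`, `probe_accuracy`, the open/closed property) is hypothetical and chosen for illustration.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_synthetic_states(n, direction, rng, noise=0.5):
    """Toy stand-in for contextual representations of one entity.

    Assumption: a binary property (say, a door being open or closed) is
    encoded along `direction`, plus Gaussian noise. Not real BART/T5 data.
    """
    examples = []
    for _ in range(n):
        label = rng.randint(0, 1)
        sign = 1.0 if label == 1 else -1.0
        vector = [sign * d + rng.gauss(0.0, noise) for d in direction]
        examples.append((vector, label))
    return examples


def train_linear_probe(examples, epochs=20, lr=0.1):
    """Fit a logistic-regression probe (weights w, bias b) with plain SGD."""
    dim = len(examples[0][0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        for vector, label in examples:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, vector)) + b)
            error = p - label  # gradient of the log loss w.r.t. the logit
            w = [wi - lr * error * xi for wi, xi in zip(w, vector)]
            b -= lr * error
    return w, b


def probe_accuracy(w, b, examples):
    """Fraction of examples whose property the linear probe reads out correctly."""
    correct = 0
    for vector, label in examples:
        logit = sum(wi * xi for wi, xi in zip(w, vector)) + b
        correct += int((logit > 0) == (label == 1))
    return correct / len(examples)


# Demo on synthetic data: the probe is frozen after training and scored on held-out vectors.
rng = random.Random(0)
direction = [rng.choice([-1.0, 1.0]) for _ in range(8)]
train_set = make_synthetic_states(200, direction, rng)
held_out = make_synthetic_states(100, direction, rng)
w, b = train_linear_probe(train_set)
print("held-out probe accuracy:", probe_accuracy(w, b, held_out))
```

In a real probing setup the synthetic vectors would be replaced by hidden states extracted from a frozen language model, and high held-out accuracy would suggest that the property is linearly decodable from those states.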


Code & Models

Repositories

belindal/state-probes
PyTorch · Official


Taxonomy

Methods: Gated Linear Unit · Multi-Head Attention · Linear Layer · Dropout · Byte Pair Encoding · Attention Is All You Need · Adam · Inverse Square Root Schedule · Layer Normalization