How Abstract Is Linguistic Generalization in Large Language Models? Experiments with Argument Structure

Michael Wilson; Jackson Petty; Robert Frank

arXiv:2311.04900 · cs.CL · November 9, 2023 · 2 cites

TL;DR

This paper tests whether pre-trained Transformer-based large language models generalize the distribution of a novel noun argument across related argument-structure contexts. The models generalize well between contexts seen during pre-training, but fail between related contexts that were not observed during pre-training.

Contribution

The study provides empirical evidence on how far linguistic generalization in LLMs extends in the domain of argument structure, and identifies a bias toward linear order when models generalize to contexts not observed during pre-training.

Findings

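01

LLMs generalize the distribution of a novel noun argument well between related contexts seen during pre-training, drawing on the semantically organized structure of the word-embedding space.

02

LLMs fail at generalizations between related contexts that were not observed during pre-training.

03

For these unobserved structural generalizations, models show a bias toward linear order.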

Abstract

Language models are typically evaluated on their success at predicting the distribution of specific words in specific contexts. Yet linguistic knowledge also encodes relationships between contexts, allowing inferences between word distributions. We investigate the degree to which pre-trained Transformer-based large language models (LLMs) represent such relationships, focusing on the domain of argument structure. We find that LLMs perform well in generalizing the distribution of a novel noun argument between related contexts that were seen during pre-training (e.g., the active object and passive subject of the verb spray), succeeding by making use of the semantically-organized structure of the embedding space for word embeddings. However, LLMs fail at generalizations between related contexts that have not been observed during pre-training, but which instantiate more abstract, but…
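As a rough illustration of the first result described in the abstract, the toy sketch below shows how a semantically organized embedding space could let a novel noun's distribution carry over between two related contexts. It is not the authors' experimental setup or code. The 2-D embeddings, the slot fillers, and the names `EMBEDDINGS`, `SLOT_FILLERS`, and `slot_score` are all hypothetical, and the slot-prototype scoring is an assumed stand-in for a real model's predictions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_vector(vectors):
    """Component-wise mean of a list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


# Hypothetical 2-D embeddings (assumption, not taken from any model):
# axis 0 is roughly "agent-like", axis 1 is roughly "theme-like".
EMBEDDINGS = {
    "painter": [0.9, 0.1],
    "gardener": [0.8, 0.2],
    "paint": [0.1, 0.9],
    "water": [0.2, 0.8],
}

# Hypothetical nouns assumed to have been seen in each slot of "spray".
# The active object and the passive subject share their fillers,
# which is what makes the two contexts "related" in this toy setting.
SLOT_FILLERS = {
    "active_subject": ["painter", "gardener"],
    "active_object": ["paint", "water"],
    "passive_subject": ["paint", "water"],
}


def slot_score(word_vector, slot):
    """Score a word for a slot by similarity to the slot's mean filler."""
    prototype = mean_vector([EMBEDDINGS[w] for w in SLOT_FILLERS[slot]])
    return cosine(word_vector, prototype)


# A novel noun whose (hypothetical) embedding sits near theme-like nouns.
novel_noun = [0.15, 0.85]
scores = {slot: slot_score(novel_noun, slot) for slot in SLOT_FILLERS}

for slot, score in scores.items():
    print(f"{slot}: {score:.3f}")
```

In this toy setting the novel noun scores the same in the active-object and passive-subject slots and much lower in the active-subject slot, because its embedding lies close to nouns already seen in both related slots. The sketch only covers contexts with observed fillers, so it says nothing about the unobserved contexts where the paper reports failure.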

Code & Models

Repositories

clay-lab/structural-alternations (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Language and cultural evolution