Language Models Learn Rare Phenomena from Less Rare Phenomena: The Case of the Missing AANNs

Kanishka Misra; Kyle Mahowald

arXiv:2403.19827 · cs.CL · June 26, 2025 · 3 cites

TL;DR

This study demonstrates that language models can learn rare grammatical phenomena such as the AANN construction by generalizing from more common related constructions, rather than solely through memorization.

Contribution

The paper provides evidence that transformer language models can acquire rare syntactic phenomena via generalization, challenging the view that such learning is mainly memorization.

Findings

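01 AANNs are learned better than systematically perturbed variants, even when AANN sentences are removed from the training corpus.

02 Learning is enhanced with increased input variability.

03 Generalization from related constructions (e.g., "a few days") explains how the rare construction is acquired.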

Abstract

Language models learn rare syntactic phenomena, but the extent to which this is attributable to generalization vs. memorization is a major open question. To that end, we iteratively trained transformer language models on systematically manipulated corpora which were human-scale in size, and then evaluated their learning of a rare grammatical phenomenon: the English Article+Adjective+Numeral+Noun (AANN) construction ("a beautiful five days"). We compared how well this construction was learned on the default corpus relative to a counterfactual corpus in which AANN sentences were removed. We found that AANNs were still learned better than systematically perturbed variants of the construction. Using additional counterfactual corpora, we suggest that this learning occurs through generalization from related constructions (e.g., "a few days"). An additional experiment showed that this…
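
The two manipulations the abstract describes, removing AANN sentences to build a counterfactual corpus and comparing the construction against perturbed variants, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy word lists, the regular expression, the function names, and the two perturbations are hypothetical and are not taken from the paper, which would likely need part-of-speech information rather than a fixed lexicon to detect AANNs in real text.

```python
import re

# Hypothetical toy lexicons (assumption); a real pipeline would likely
# rely on POS tagging rather than fixed word lists.
ADJECTIVES = {"beautiful", "lovely", "mere", "whopping"}
NUMERALS = {"two", "three", "five", "ten", "hundred"}

# Toy pattern: article + adjective + numeral + one following word (the noun).
AANN_PATTERN = re.compile(
    r"\b(a|an)\s+(%s)\s+(%s)\s+(\w+)"
    % ("|".join(sorted(ADJECTIVES)), "|".join(sorted(NUMERALS))),
    re.IGNORECASE,
)


def contains_aann(sentence):
    """Return True if the sentence matches the toy AANN pattern."""
    return AANN_PATTERN.search(sentence) is not None


def remove_aanns(corpus):
    """Build a counterfactual corpus by dropping sentences with an AANN."""
    return [s for s in corpus if not contains_aann(s)]


def perturb(article, adjective, numeral, noun):
    """Return the original phrase plus two illustrative perturbed variants."""
    return {
        "original": f"{article} {adjective} {numeral} {noun}",
        "swap_adj_num": f"{article} {numeral} {adjective} {noun}",
        "no_article": f"{adjective} {numeral} {noun}",
    }


corpus = [
    "We spent a beautiful five days in Austin.",
    "We spent a few days in Austin.",
    "Five beautiful days went by.",
]
filtered = remove_aanns(corpus)
variants = perturb("a", "beautiful", "five", "days")

print(filtered)
print(variants)
```

In an evaluation along these lines, a model trained on the filtered corpus would be asked to score the original phrase against each perturbed variant; the scoring step is omitted here because it depends on the model and metric used.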

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques