Mission: Impossible Language Models

Julie Kallini; Isabel Papadimitriou; Richard Futrell; Kyle Mahowald; Christopher Potts

arXiv:2401.06416 · cs.CL · August 6, 2024 · 1 citation

TL;DR

This study tests whether GPT-2 small models can learn synthetic impossible languages. The models struggle with these languages compared to English, which challenges claims that LLMs learn possible and impossible human languages equally well and informs linguistic and cognitive research.

Contribution

The paper introduces a systematic set of synthetic impossible languages and evaluates how well GPT-2 small models learn each of them, providing empirical evidence about the limits of what LLMs can learn.

Findings

01

GPT-2 struggles with impossible languages compared to English.

02

How well GPT-2 learns an impossible language varies with that language's complexity.

03

The results challenge the claim that LLMs learn possible and impossible languages equally well.

Abstract

Chomsky and others have very directly claimed that large language models (LLMs) are equally capable of learning languages that are possible and impossible for humans to learn. However, there is very little published experimental evidence to support such a claim. Here, we develop a set of synthetic impossible languages of differing complexity, each designed by systematically altering English data with unnatural word orders and grammar rules. These languages lie on an impossibility continuum: at one end are languages that are inherently impossible, such as random and irreversible shuffles of English words, and on the other, languages that may not be intuitively impossible but are often considered so in linguistics, particularly those with rules based on counting word positions. We report on a wide range of evaluations to assess the capacity of GPT-2 small models to learn these…
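The abstract describes perturbations of English such as shuffles of words and rules that count word positions. The following is a minimal sketch of what such perturbations could look like, not the authors' implementation: the function names, the `<M>` marker, the `gap` parameter, and the whitespace tokenization are all assumptions made for illustration.

```python
import random


def deterministic_shuffle(words, base_seed=0):
    """Shuffle a sentence with a permutation that depends only on its length.

    Hypothetical helper: every sentence of the same length receives the
    same permutation, so the mapping is fixed rather than random per sentence.
    """
    rng = random.Random(base_seed + len(words))
    order = list(range(len(words)))
    rng.shuffle(order)
    return [words[i] for i in order]


def reverse_words(words):
    """Reverse the word order of a sentence."""
    return list(reversed(words))


def insert_marker_by_count(words, trigger, marker="<M>", gap=2):
    """Counting rule: place `marker` so that exactly `gap` words separate
    it from the first occurrence of `trigger`.

    If the sentence is too short, the marker goes at the end. If the
    trigger is absent, the sentence is returned unchanged.
    """
    out = list(words)
    if trigger not in out:
        return out
    position = min(out.index(trigger) + 1 + gap, len(out))
    out.insert(position, marker)
    return out


if __name__ == "__main__":
    sentence = "the dog chases the cat in the yard".split()
    print(deterministic_shuffle(sentence))
    print(reverse_words(sentence))
    print(insert_marker_by_count(sentence, "chases", gap=2))
```

The counting rule is the interesting case: the marker's position depends on a linear word count rather than on syntactic structure, which is the kind of rule the abstract places at the less intuitively impossible end of the continuum.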

Code & Models

Repositories

jkallini/mission-impossible-language-models (PyTorch, official)

Videos

Mission: Impossible Language Models – Paper Explained [ACL 2024 recording] · YouTube

Mission: Impossible Language Models · Underline

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification

Methods: Sparse Evolutionary Training · Multi-Head Attention · Attention Is All You Need · Linear Layer · Dropout · Adam · Cosine Annealing · Dense Connections · Linear Warmup With Cosine Annealing