Kallini et al. (2024) do not compare impossible languages with constituency-based ones
Tim Hunter

TL;DR
This paper critiques the study by Kallini et al. (2024) of language models learning synthetic languages, highlighting a confound in that study's comparison and proposing methods for more accurate testing of models' ability to distinguish possible from impossible human languages.
Contribution
It identifies a confound in previous research comparing language models' learning of different synthetic languages and suggests improved experimental approaches.
Findings
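Kallini et al. (2024) found that GPT-2 learns some synthetic languages more successfully than others.
That comparison contains a confound that undermines the conclusions drawn from these asymmetries.
The paper proposes methods for testing more accurately whether language models distinguish possible from impossible human languages.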
Abstract
A central goal of linguistic theory is to find a precise characterization of the notion "possible human language", in the form of a computational device that is capable of describing all and only the languages that can be acquired by a typically developing human child. The success of recent large language models (LLMs) in NLP applications arguably raises the possibility that LLMs might be computational devices that meet this goal. This would only be the case if, in addition to succeeding in learning human languages, LLMs struggle to learn "impossible" human languages. Kallini et al. (2024; "Mission: Impossible Language Models", Proc. ACL) conducted experiments aiming to test this by training GPT-2 on a variety of synthetic languages, and found that it learns some more successfully than others. They present these asymmetries as support for the idea that LLMs' inductive biases align with…
Taxonomy
Topics: Natural Language Processing Techniques · Text Readability and Simplification
Methods: Attention Is All You Need · Cosine Annealing · Linear Layer · Multi-Head Attention · Dropout · Layer Normalization · Linear Warmup With Cosine Annealing · Adam · Attention Dropout
