From Babbling to Fluency: Evaluating the Evolution of Language Models in Terms of Human Language Acquisition
Qiyuan Yang, Pengda Wang, Luke D. Plonsky, Frederick L. Oswald, and Hanjie Chen

TL;DR
This paper evaluates language models through the lens of human language acquisition, revealing that despite performance improvements, their developmental trajectory differs from that of humans and is heavily influenced by features of the training data.
Contribution
It introduces a three-stage framework inspired by human language development to assess LMs and analyzes how training data influences their linguistic capabilities.
Findings
Recent LMs outperform earlier models in overall performance.
LM development does not strictly follow human language acquisition stages.
Linguistic features of training data significantly impact model abilities.
Abstract
We examine the language capabilities of language models (LMs) from the critical perspective of human language acquisition. Building on classical language development theories, we propose a three-stage framework to assess the abilities of LMs, ranging from preliminary word understanding to complex grammar and complex logical reasoning. Using this framework, we evaluate the generative capacities of LMs using methods from linguistic research. Results indicate that although recent LMs outperform earlier models in overall performance, their developmental trajectory does not strictly follow the path of human language acquisition. Notably, in generation tasks, LMs are more similar to human performance in areas where information is easier to extract from the corpus, such as average word length, clauses, and auxiliary verbs. Newer LMs did not exhibit significant progress in terms of specific…
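As an illustration of the kind of surface features the abstract mentions (average word length, auxiliary verbs), here is a minimal sketch of how such measures could be computed over a piece of generated text. It is not the authors' evaluation code: the tokenizer, the `AUXILIARIES` word list, and the function names are all assumptions made for this example, and the auxiliary count is a crude word-list heuristic that does not separate auxiliary from main-verb uses of words like "have".

```python
import re

# Assumed, non-exhaustive list of English auxiliary verbs (illustrative only;
# not taken from the paper).
AUXILIARIES = {
    "am", "is", "are", "was", "were", "be", "been", "being",
    "do", "does", "did", "have", "has", "had",
    "will", "would", "shall", "should",
    "can", "could", "may", "might", "must",
}


def tokenize(text):
    """Lowercase the text and return runs of letters/apostrophes as words."""
    return re.findall(r"[a-z']+", text.lower())


def average_word_length(text):
    """Mean number of characters per word; 0.0 for text with no words."""
    words = tokenize(text)
    return sum(len(w) for w in words) / len(words) if words else 0.0


def auxiliary_ratio(text):
    """Share of words that appear in the assumed auxiliary list."""
    words = tokenize(text)
    return sum(w in AUXILIARIES for w in words) / len(words) if words else 0.0
```

Computing the same measures on model output and on a human reference text, then comparing the two values, gives a rough sense of how close the model is on each feature. Clause counts would need a syntactic parser and are left out of this sketch.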
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Language and cultural evolution
