What Kind of Language is Easy to Language-Model Under Curriculum Learning?
Nadine El-Naggar, Tatsuki Kuribayashi, Ted Briscoe

TL;DR
This paper investigates how curriculum learning, i.e., training on simpler sentences first, interacts with the inductive biases of language models, and finds that this ordering substantially changes which typological patterns the models reproduce.
Contribution
The paper introduces a simple curriculum learning variant to study how training order affects language model biases and the reproduction of typological patterns.
Findings
Curriculum learning substantially impacts language model biases.
Starting with simpler sentences alters the typological patterns learned.
The study expands understanding of learning scenarios in language modeling.
Abstract
Many of the thousands of attested languages share common configurations of features, creating a spectrum from typologically very rare (e.g., object-verb-subject word order) or impossible languages to very common combinations of features (e.g., subject-object-verb word order). One central question is under what conditions such typological tendencies can be predicted, and specifically whether the learning bias of language models (LMs) is sufficient to reproduce such patterns. In this study, we add one dimension to such analysis -- the learning scenario for LMs -- to explore its interaction with the inductive bias of LMs. Specifically, as a first study, we examine the effect of curriculum learning (CL), as a developmentally motivated learning scenario, i.e., starting with simpler sentences rather than randomly-ordered input. We expand existing LM-based exploration (El-Naggar et al.,…
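
As a rough illustration of the contrast the abstract draws between starting with simpler sentences and randomly-ordered input, the sketch below orders a toy corpus both ways. It is not the authors' implementation: the function name `order_for_curriculum`, the toy corpus, and in particular the use of token count as the measure of simplicity are assumptions made here for illustration, since the summary does not say how simplicity is defined.

```python
import random


def order_for_curriculum(sentences, curriculum=True, seed=0):
    """Return training sentences in curriculum order or in random order.

    Assumption: whitespace token count stands in for "simplicity";
    the source does not specify which measure is used.
    """
    if curriculum:
        # sorted() is stable, so sentences of equal length keep their input order.
        return sorted(sentences, key=lambda s: len(s.split()))
    # Baseline scenario: randomly ordered input, seeded for reproducibility.
    shuffled = list(sentences)
    random.Random(seed).shuffle(shuffled)
    return shuffled


# Hypothetical toy corpus, for illustration only.
corpus = [
    "the dog that the cat saw chased the bird",
    "dogs bark",
    "the cat saw the dog",
]

curriculum_order = order_for_curriculum(corpus)
random_order = order_for_curriculum(corpus, curriculum=False)
```

Both orderings contain exactly the same sentences, so any difference in what a model learns from them would be attributable to presentation order rather than to the data itself.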
