Injecting structural hints: Using language models to study inductive biases in language learning
Isabel Papadimitriou, Dan Jurafsky

TL;DR
This paper injects specific inductive biases into transformer language models by pretraining them on formally structured data, then measures how well the biased models learn typologically diverse natural languages.
Contribution
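The paper introduces an experimental framework in which transformer language models are pretrained on formally structured data and then evaluated on their ability to learn natural languages, so that the effect of each injected inductive bias can be compared. Among the biases tested, non-context-free relationships prove the most important.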
Findings
Of the three biases tested, non-context-free (crossing) token-token relationships form the best inductive bias.
Injecting hierarchical and complex token relationships improves language learning.
Transformer models can simulate controlled language learning experiments.
Abstract
Both humans and large language models are able to learn language without explicit structural supervision. What inductive biases make this learning possible? We address this fundamental cognitive question by leveraging transformer language models: we inject inductive bias into language models by pretraining on formally-structured data, and then evaluate the biased learners' ability to learn typologically-diverse natural languages. Our experimental setup creates a testbed for hypotheses about inductive bias in human language learning. We investigate the effect of injecting models with three types of inductive bias: 1) recursive, hierarchical processing, 2) crossing token-token relationships that can't be modeled by context-free grammars, and 3) a Zipfian power-law vocabulary distribution. We show that non-context-free relationships form the best inductive biases. Our study leverages the…
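The three kinds of formally structured pretraining data named in the abstract can be illustrated with a small generator. This is a minimal sketch under stated assumptions, not the authors' procedure: the function names, the `p_open` probability, the Zipf exponent of 1.0, and the `("open", type)` / `("close", type)` token encoding are all hypothetical choices made for illustration.

```python
import random


def zipf_sampler(vocab_size, rng, exponent=1.0):
    """Return a function that draws token ids with probability ~ 1 / rank**exponent.

    Assumption: exponent=1.0 is an illustrative default, not a value from the paper.
    """
    weights = [1.0 / (rank ** exponent) for rank in range(1, vocab_size + 1)]

    def sample():
        return rng.choices(range(vocab_size), weights=weights, k=1)[0]

    return sample


def nested_sequence(n_pairs, sample_type, rng, p_open=0.5):
    """Hierarchical bias: every close matches the most recently opened token."""
    tokens, stack, opened = [], [], 0
    while opened < n_pairs or stack:
        if opened < n_pairs and (not stack or rng.random() < p_open):
            t = sample_type()
            stack.append(t)
            tokens.append(("open", t))
            opened += 1
        else:
            tokens.append(("close", stack.pop()))  # last in, first out
    return tokens


def crossing_sequence(n_pairs, sample_type, rng, p_open=0.5):
    """Crossing bias: a close may match any pending open, so dependencies can cross."""
    tokens, pending, opened = [], [], 0
    while opened < n_pairs or pending:
        if opened < n_pairs and (not pending or rng.random() < p_open):
            t = sample_type()
            pending.append(t)
            tokens.append(("open", t))
            opened += 1
        else:
            idx = rng.randrange(len(pending))  # not restricted to the newest open
            tokens.append(("close", pending.pop(idx)))
    return tokens


def zipf_sequence(length, sample_type):
    """Vocabulary-only bias: Zipfian unigram draws with no pairing structure."""
    return [sample_type() for _ in range(length)]


def is_well_nested(tokens):
    """Check that a token list obeys strict last-in, first-out matching."""
    stack = []
    for kind, t in tokens:
        if kind == "open":
            stack.append(t)
        elif not stack or stack.pop() != t:
            return False
    return not stack


rng = random.Random(0)
sample_type = zipf_sampler(vocab_size=50, rng=rng)
nested = nested_sequence(8, sample_type, rng)
crossing = crossing_sequence(8, sample_type, rng)
unstructured = zipf_sequence(16, sample_type)
```

The only difference between the nested and crossing generators is which pending token a close is allowed to match. A stack discipline yields structure a context-free grammar can describe, while closing an arbitrary pending token produces the crossing relationships that a context-free grammar cannot model.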
Taxonomy
Topics: Natural Language Processing Techniques · Speech and dialogue systems · Topic Modeling
