Injecting structural hints: Using language models to study inductive biases in language learning

Isabel Papadimitriou; Dan Jurafsky

arXiv:2304.13060 · cs.CL · October 31, 2023 · 5 cites

TL;DR

This paper investigates how injecting different types of inductive biases into transformer language models affects their ability to learn typologically-diverse natural languages, providing insights into language learning mechanisms.

Contribution

It introduces a novel experimental framework using transformer models to test the impact of specific inductive biases on language learning, highlighting the importance of non-context-free relationships.

Findings

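01. Of the three inductive biases tested, non-context-free (crossing) token-token relationships form the best one.

02. Injecting hierarchical and crossing token relationships through pretraining on formally structured data improves subsequent language learning.

03. Transformer language models can serve as a testbed for controlled experiments on inductive bias in language learning.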

Abstract

Both humans and large language models are able to learn language without explicit structural supervision. What inductive biases make this learning possible? We address this fundamental cognitive question by leveraging transformer language models: we inject inductive bias into language models by pretraining on formally-structured data, and then evaluate the biased learners' ability to learn typologically-diverse natural languages. Our experimental setup creates a testbed for hypotheses about inductive bias in human language learning. We investigate the effect of injecting models with three types of inductive bias: 1) recursive, hierarchical processing, 2) crossing token-token relationships that can't be modeled by context-free grammars, and 3) a Zipfian power-law vocabulary distribution. We show that non-context-free relationships form the best inductive biases. Our study leverages the…
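To make the three kinds of structure named in the abstract concrete, here is a minimal sketch of how such formally structured pretraining data could be generated. It is an illustration under assumptions, not the authors' procedure: the open/close token-pair encoding, the function names, and the parameters (`p_open`, `exponent`) are all hypothetical choices made for this example.

```python
import random
from collections import Counter


def zipf_weights(vocab_size, exponent=1.0):
    """Unnormalized Zipfian weights: the token of rank r gets weight 1 / r**exponent."""
    return [1.0 / (rank ** exponent) for rank in range(1, vocab_size + 1)]


def sample_ids(n_pairs, vocab_size, rng, zipfian=True):
    """Draw token ids either from a Zipfian or a uniform vocabulary distribution."""
    weights = zipf_weights(vocab_size) if zipfian else None
    return rng.choices(range(vocab_size), weights=weights, k=n_pairs)


def nested_sequence(n_pairs, vocab_size, rng, p_open=0.5, zipfian=True):
    """Hierarchical structure: each close matches the most recent unmatched open."""
    ids = sample_ids(n_pairs, vocab_size, rng, zipfian)
    seq, stack, i = [], [], 0
    while i < n_pairs or stack:
        if i < n_pairs and (not stack or rng.random() < p_open):
            stack.append(ids[i])
            seq.append(("open", ids[i]))
            i += 1
        else:
            seq.append(("close", stack.pop()))  # last in, first out
    return seq


def crossing_sequence(n_pairs, vocab_size, rng, p_open=0.5, zipfian=True):
    """Non-context-free structure: a close may match ANY unmatched open,
    so open-close dependencies are allowed to cross each other."""
    ids = sample_ids(n_pairs, vocab_size, rng, zipfian)
    seq, pending, i = [], [], 0
    while i < n_pairs or pending:
        if i < n_pairs and (not pending or rng.random() < p_open):
            pending.append(ids[i])
            seq.append(("open", ids[i]))
            i += 1
        else:
            tok = pending.pop(rng.randrange(len(pending)))  # arbitrary open
            seq.append(("close", tok))
    return seq


def is_well_nested(seq):
    """True if every close matches the most recently opened token."""
    stack = []
    for kind, tok in seq:
        if kind == "open":
            stack.append(tok)
        elif not stack or stack.pop() != tok:
            return False
    return not stack


def is_balanced(seq):
    """True if every close has an earlier open of the same token and none are left open."""
    counts = Counter()
    for kind, tok in seq:
        counts[tok] += 1 if kind == "open" else -1
        if counts[tok] < 0:
            return False
    return all(v == 0 for v in counts.values())
```

In this sketch, the nested generator always passes `is_well_nested`, while the crossing generator only guarantees `is_balanced`; switching `zipfian` on or off changes the vocabulary distribution independently of the dependency structure, which is what lets the three biases be varied separately.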

Code & Models

Repositories

toizzy/injecting-structural-hints (Official)

Taxonomy

Topics: Natural Language Processing Techniques · Speech and dialogue systems · Topic Modeling