Examining the Inductive Bias of Neural Language Models with Artificial Languages

Jennifer C. White; Ryan Cotterell

arXiv:2106.01044 · cs.CL · June 3, 2021

TL;DR

This paper introduces a novel method using artificial languages to systematically investigate the inductive biases of neural language models, revealing architecture-dependent preferences that differ from natural language tendencies.

Contribution

It proposes a controlled experimental framework with artificial languages to isolate and analyze the inductive biases of neural architectures like LSTMs and transformers.

Findings

01. LSTMs show little preference among word orderings.

02. Transformers show a clear preference for some word orderings over others.

03. Neither architecture's bias reflects the word-order tendencies of attested natural languages.

Abstract

Since language models are used to model a wide variety of languages, it is natural to ask whether the neural architectures used for the task have inductive biases towards modeling particular types of languages. Investigation of these biases has proved complicated due to the many variables that appear in the experimental setup. Languages vary in many typological dimensions, and it is difficult to single out one or two to investigate without the others acting as confounders. We propose a novel method for investigating the inductive biases of language models using artificial languages. These languages are constructed to allow us to create parallel corpora across languages that differ only in the typological feature being investigated, such as word order. We then use them to train and test language models. This constitutes a fully controlled causal framework, and demonstrates how grammar…
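The construction described in the abstract, parallel corpora that differ only in one typological feature, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy lexicon, the two word-order switches (`verb_final`, `postpositions`), and the function names are hypothetical and are not taken from the paper or its repository. The idea shown is only that each sentence is first sampled as an order-free structure and then linearized once per switch setting, so every corpus contains the same content in a different order.

```python
import random

# Hypothetical toy lexicon (assumption); not the paper's vocabulary.
NOUNS = ["dax", "wug", "blick"]
VERBS = ["glorps", "fends"]
ADPOSITIONS = ["na", "ko"]


def sample_tree(rng):
    """Sample an order-free clause: subject, verb, object, optional adpositional phrase."""
    tree = {
        "subj": rng.choice(NOUNS),
        "verb": rng.choice(VERBS),
        "obj": rng.choice(NOUNS),
    }
    if rng.random() < 0.5:
        tree["pp"] = (rng.choice(ADPOSITIONS), rng.choice(NOUNS))
    return tree


def linearize(tree, verb_final, postpositions):
    """Turn one order-free clause into a word string under two word-order switches."""
    if verb_final:
        vp = [tree["obj"], tree["verb"]]
    else:
        vp = [tree["verb"], tree["obj"]]
    words = [tree["subj"]] + vp
    if "pp" in tree:
        adp, noun = tree["pp"]
        words += [noun, adp] if postpositions else [adp, noun]
    return " ".join(words)


def parallel_corpora(n, seed=0):
    """Build one corpus per switch setting from the same n sampled clauses."""
    rng = random.Random(seed)
    trees = [sample_tree(rng) for _ in range(n)]
    corpora = {}
    for verb_final in (False, True):
        for postpositions in (False, True):
            corpora[(verb_final, postpositions)] = [
                linearize(t, verb_final, postpositions) for t in trees
            ]
    return corpora
```

Because all corpora are derived from the same sampled clauses, sentence `i` in each corpus carries the same words and differs only in order. Under this setup, a difference in held-out performance between models trained on two of these corpora could be attributed to the switch that distinguishes them rather than to content.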

Code & Models

Repositories

rycolab/artificial-languages
PyTorch · Official

Taxonomy

Methods: Tanh Activation · Sigmoid Activation · Long Short-Term Memory