The Inductive Bias of In-Context Learning: Rethinking Pretraining Example Design

Yoav Levine; Noam Wies; Daniel Jannai; Dan Navon; Yedid Hoshen; Amnon Shashua

arXiv:2110.04541 · cs.CL · March 22, 2022 · 1 cites

TL;DR

This paper shows that chunking pretraining text into contiguous examples biases neural language models toward modeling dependencies within a training example much more strongly than dependencies across examples. It proposes a new example design, "kNN-Pretraining", to improve language understanding and question answering.

Contribution

It formalizes the bias introduced by contiguous chunking in pretraining and introduces "kNN-Pretraining" to improve language understanding and question answering.

Findings

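01 The pretrained NLM models much stronger dependencies between text segments in the same training example than between segments in different examples.

02 Including semantically related sentences in the same pretraining example improves sentence representations.

03 The proposed kNN-Pretraining scheme improves question answering abilities.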

Abstract

Pretraining Neural Language Models (NLMs) over a large corpus involves chunking the text into training examples, which are contiguous text segments of sizes processable by the neural architecture. We highlight a bias introduced by this common practice: we prove that the pretrained NLM can model much stronger dependencies between text segments that appeared in the same training example, than it can between text segments that appeared in different training examples. This intuitive result has a twofold role. First, it formalizes the motivation behind a broad line of recent successful NLM training heuristics, proposed for the pretraining and fine-tuning stages, which do not necessarily appear related at first glance. Second, our result clearly indicates further improvements to be made in NLM pretraining for the benefit of Natural Language Understanding tasks. As an example, we propose…
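The contrast between contiguous chunking and a nearest-neighbor example design can be illustrated with a toy sketch. This is not the authors' implementation, and the text above does not specify how kNN-Pretraining retrieves related sentences. The function names (`contiguous_chunks`, `knn_examples`), the toy corpus, and the use of bag-of-words cosine similarity in place of a learned sentence embedding are all assumptions made for illustration.

```python
import math
from collections import Counter


def contiguous_chunks(sentences, chunk_size):
    """Common practice: each training example is a run of consecutive sentences."""
    return [sentences[i:i + chunk_size] for i in range(0, len(sentences), chunk_size)]


def bow(sentence):
    """Bag-of-words counts; an assumed stand-in for a learned sentence embedding."""
    return Counter(sentence.lower().split())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_examples(sentences, k):
    """Hypothetical kNN-style design: pair each sentence with its k most
    similar sentences from anywhere in the corpus, not only its neighbors."""
    vectors = [bow(s) for s in sentences]
    examples = []
    for i, sentence in enumerate(sentences):
        others = [j for j in range(len(sentences)) if j != i]
        others.sort(key=lambda j: cosine(vectors[i], vectors[j]), reverse=True)
        examples.append([sentence] + [sentences[j] for j in others[:k]])
    return examples


# Toy corpus: the two Paris sentences are related but far apart in the text.
corpus = [
    "the eiffel tower is in paris",
    "bananas are rich in potassium",
    "the stock market fell today",
    "paris is the capital of france",
]

chunks = contiguous_chunks(corpus, chunk_size=2)
knn = knn_examples(corpus, k=1)
```

With this toy corpus, contiguous chunking never places the two Paris sentences in the same example, whereas the kNN-style grouping pairs them. In the abstract's terms, only the second design gives the model an in-example opportunity to learn the dependency between them.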

Videos

The Inductive Bias of In-Context Learning: Rethinking Pretraining Example Design · SlidesLive

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications