From Unstructured Data to In-Context Learning: Exploring What Tasks Can Be Learned and When

Kevin Christian Wibisono; Yixin Wang

arXiv:2406.00131·cs.LG·November 12, 2024

TL;DR

This paper investigates how large language models trained on unstructured text can perform in-context learning (ICL). It shows that many ICL capabilities arise from simple co-occurrence patterns and that the structure of the training data matters.

Contribution

The paper demonstrates that in-context learning can emerge from unstructured data through co-occurrence, clarifies when positional information is necessary, and identifies limitations in logic reasoning and analogy tasks.

Findings

01 ICL capabilities can arise from co-occurrence in unstructured data

02 Positional information is crucial for logic reasoning tasks

03 ICL fails when the relevant pairs appear only at fixed positions in the training data

Abstract

Large language models (LLMs) like transformers demonstrate impressive in-context learning (ICL) capabilities, allowing them to make predictions for new tasks based on prompt exemplars without parameter updates. While existing ICL theories often assume structured training data resembling ICL tasks (e.g., x-y pairs for linear regression), LLMs are typically trained unsupervised on unstructured text, such as web content, which lacks clear parallels to tasks like word analogy. To address this gap, we examine what enables ICL in models trained on unstructured data, focusing on critical sequence model requirements and training data structure. We find that many ICL capabilities can emerge simply from co-occurrence of semantically related word pairs in unstructured data; word analogy completion, for example, can provably arise purely through co-occurrence modeling, using classical language…
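
To make the co-occurrence idea in the abstract concrete, here is a minimal toy sketch. It is not the paper's model or experimental setup: the corpus, the function names `cooccurrence_counts` and `complete`, and the scoring rule (sum of raw co-occurrence counts with the prompt words) are all assumptions chosen for illustration. It only shows how a prompt such as "france paris japan tokyo italy" could be completed from co-occurrence statistics alone, without any structured training pairs.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy corpus: related pairs co-occur alongside filler words.
CORPUS = [
    "france paris wine",
    "paris france art",
    "france cheese paris",
    "japan tokyo tea",
    "tokyo japan rail",
    "japan sushi tokyo",
    "italy rome pasta",
    "rome italy opera",
    "italy forum rome",
]


def cooccurrence_counts(sentences):
    """Count how often two distinct words appear in the same sentence."""
    counts = Counter()
    for sentence in sentences:
        words = sorted(set(sentence.split()))
        for a, b in combinations(words, 2):
            counts[(a, b)] += 1
            counts[(b, a)] += 1
    return counts


def complete(prompt, counts):
    """Return the unseen word that co-occurs most with the prompt words.

    Assumption: the score is a plain sum of co-occurrence counts; ties are
    broken alphabetically so the result is deterministic.
    """
    prompt_words = prompt.split()
    vocab = {a for a, _ in counts}
    candidates = vocab - set(prompt_words)
    scores = {w: sum(counts[(w, p)] for p in prompt_words) for w in candidates}
    return max(sorted(scores), key=scores.get)


counts = cooccurrence_counts(CORPUS)
print(complete("france paris japan tokyo italy", counts))  # rome
```

In this toy corpus "rome" wins because it shares three sentences with "italy", while each filler word shares only one sentence with the words in the prompt. Position plays no role in the scoring, so the sketch illustrates only the co-occurrence case, not the tasks where positional information is needed.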

Videos

From Unstructured Data to In-Context Learning: Exploring What Tasks Can Be Learned and When · SlidesLive

Taxonomy

Topics: Neural Networks and Applications · Machine Learning and Algorithms
