The Alchemy of Thought: Understanding In-Context Learning Through Supervised Classification
Harshita Narnoli, Mihai Surdeanu

TL;DR
This paper investigates how in-context learning (ICL) in large language models (LLMs) relates to supervised classifiers trained on the same demonstrations. It finds that ICL behaves more like k-nearest neighbors (kNN) when the demonstrations are highly relevant, and that LLMs outperform these classifiers when relevance is low.
Contribution
The paper provides empirical evidence comparing ICL behavior with classifiers based on gradient descent (GD) and on kNN, clarifying when LLMs resemble these models and when they outperform them.
Findings
ICL behaves similarly to kNN when demonstrations are highly relevant.
LLMs outperform classifiers when demonstration relevance is low.
Attention mechanisms in LLMs are more akin to kNN than gradient descent.
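One way to see why attention can resemble kNN is to treat the demonstrations as keys and their labels as one-hot values: softmax attention then reduces to a similarity-weighted vote over the demonstrations. The sketch below is an illustration of that general observation only, not the paper's analysis; the toy vectors, the `attention_vote` helper, and the `temperature` parameter are all assumptions introduced here.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_vote(query, keys, labels, num_classes, temperature=1.0):
    # Dot-product attention where each value is a one-hot label:
    # the output is a similarity-weighted vote over the demonstrations.
    scores = [sum(q * k for q, k in zip(query, key)) / temperature
              for key in keys]
    weights = softmax(scores)
    vote = [0.0] * num_classes
    for w, y in zip(weights, labels):
        vote[y] += w
    return vote

# Toy demonstrations (hypothetical embeddings): two of class 0, one of class 1.
keys = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
labels = [0, 0, 1]
query = [0.2, 0.8]  # closest to the single class-1 demonstration

soft = attention_vote(query, keys, labels, 2, temperature=1.0)
sharp = attention_vote(query, keys, labels, 2, temperature=0.01)
print("soft vote:", soft)
print("sharp vote:", sharp)
```

With a high temperature the weights are spread out and the two class-0 demonstrations win the vote, much like kNN with a large k. With a low temperature nearly all weight goes to the single most similar demonstration, which approaches 1-nearest-neighbor behavior.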
Abstract
In-context learning (ICL) has become a prominent paradigm to rapidly customize LLMs to new tasks without fine-tuning. However, despite the empirical evidence of its usefulness, we still do not truly understand how ICL works. In this paper, we compare the behavior of in-context learning with supervised classifiers trained on ICL demonstrations to investigate three research questions: (1) Do LLMs with ICL behave similarly to classifiers trained on the same examples? (2) If so, which classifiers are closer, those based on gradient descent (GD) or those based on k-nearest neighbors (kNN)? (3) When they do not behave similarly, what conditions are associated with differences in behavior? Using text classification as a use case, with six datasets and three LLMs, we observe that LLMs behave similarly to these classifiers when the relevance of demonstrations is high. On average, ICL is closer…
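The comparison described in the abstract can be sketched as follows: train a kNN classifier and a GD-trained classifier on the same demonstrations, then measure how often each agrees with the ICL predictions. This is a minimal sketch under stated assumptions, not the authors' setup: the bag-of-words features, the logistic-regression model, the toy sentiment data, and the `icl_preds` list (a hand-written placeholder standing in for real LLM outputs) are all hypothetical.

```python
import math
from collections import Counter

def bow(text, vocab):
    # Bag-of-words count vector over a fixed vocabulary.
    counts = Counter(text.lower().split())
    return [float(counts[w]) for w in vocab]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    na, nb = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    return dot(a, b) / (na * nb) if na and nb else 0.0

def knn_predict(x, train_x, train_y, k=3):
    # Majority vote among the k most cosine-similar demonstrations.
    ranked = sorted(range(len(train_x)),
                    key=lambda i: cosine(x, train_x[i]), reverse=True)
    return Counter(train_y[i] for i in ranked[:k]).most_common(1)[0][0]

def train_logreg(train_x, train_y, lr=0.5, epochs=200):
    # Binary logistic regression fit with plain stochastic gradient descent.
    w, b = [0.0] * len(train_x[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(train_x, train_y):
            p = 1.0 / (1.0 + math.exp(-(dot(w, x) + b)))
            g = p - y
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

def agreement(a, b):
    # Fraction of positions where two prediction lists match.
    return sum(x == y for x, y in zip(a, b)) / len(a)

# Toy demonstrations (hypothetical): 1 = positive, 0 = negative.
demos = [("great movie loved it", 1), ("wonderful acting great plot", 1),
         ("terrible movie hated it", 0), ("boring plot awful acting", 0)]
tests = ["loved the great acting", "awful boring movie"]
icl_preds = [1, 0]  # placeholder, NOT real LLM output

vocab = sorted({w for text, _ in demos for w in text.split()})
train_x = [bow(t, vocab) for t, _ in demos]
train_y = [y for _, y in demos]
test_x = [bow(t, vocab) for t in tests]

knn_preds = [knn_predict(x, train_x, train_y) for x in test_x]
w, b = train_logreg(train_x, train_y)
gd_preds = [1 if dot(w, x) + b > 0 else 0 for x in test_x]

print("ICL vs kNN agreement:", agreement(icl_preds, knn_preds))
print("ICL vs GD agreement:", agreement(icl_preds, gd_preds))
```

In a real experiment, `icl_preds` would come from prompting an LLM with the same demonstrations, and the features would more plausibly be sentence embeddings than word counts.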
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Child and Animal Learning Development · Text and Document Classification Technologies
