Can In-context Learners Learn a Reasoning Concept from Demonstrations?
Michal Štefánik, Marek Kadlčík

TL;DR
This paper investigates whether in-context learners can learn a reasoning concept from demonstrations that share that concept with the predicted sample. It finds that most models do not benefit from such demonstrations, while T0 models show some sensitivity to them.
Contribution
The paper introduces a concept-sharing evaluation method to better assess in-context learning of reasoning concepts, highlighting limitations of current models and the relative sensitivity of T0 models.
Findings
Most models do not benefit from concept-sharing demonstrations.
T0 models show increased sensitivity to demonstrated concepts.
Evaluation method isolates models' ability to learn new reasoning concepts.
Abstract
Language models exhibit an emergent ability to learn a new task from a small number of input-output demonstrations. However, recent work shows that in-context learners largely rely on their pre-trained knowledge, such as the sentiment of the labels, instead of learning new associations from the input. We argue that the commonly-used few-shot evaluation using a random selection of in-context demonstrations cannot disentangle models' reliance on such biases, as most of the randomly-selected demonstrations do not present relations informative for prediction beyond exposing the task's input-output distribution. Therefore, to evaluate models' in-context learning ability independent of models' memory, we introduce a Concept-sharing few-shot learning method choosing the demonstrations that share an underlying concept with the predicted sample. We extract a set of such concepts from…
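The contrast the abstract draws between random and concept-sharing demonstration selection can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `Sample` class, its `concept` field, the two selection functions, and the `Q:`/`A:` prompt format are hypothetical, and the sketch simply assumes that each sample already carries a concept label. How the paper actually extracts concepts is not covered by the truncated abstract, so this is not the authors' implementation.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    question: str
    answer: str
    concept: str  # hypothetical label; assumed to be annotated in advance


def random_demos(pool, target, k, rng):
    """Baseline: pick k demonstrations at random, ignoring concepts."""
    candidates = [s for s in pool if s is not target]
    return rng.sample(candidates, min(k, len(candidates)))


def concept_sharing_demos(pool, target, k, rng):
    """Pick up to k demonstrations whose concept matches the target's."""
    sharing = [
        s for s in pool
        if s is not target and s.concept == target.concept
    ]
    return rng.sample(sharing, min(k, len(sharing)))


def build_prompt(demos, target):
    """Format demonstrations followed by the unanswered target question."""
    blocks = [f"Q: {d.question}\nA: {d.answer}" for d in demos]
    blocks.append(f"Q: {target.question}\nA:")
    return "\n\n".join(blocks)
```

Under these assumptions, comparing a model's accuracy on prompts built from `concept_sharing_demos` against prompts built from `random_demos` gives a rough indication of whether the model uses the demonstrated concept at all.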
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
