In-context Learning Generalizes, But Not Always Robustly: The Case of Syntax
Aaron Mueller, Albert Webson, Jackson Petty, Tal Linzen

TL;DR
This paper investigates whether large language models truly understand syntax through in-context learning or rely on superficial heuristics, examining their robustness and the effects of chain-of-thought prompting across different models.
Contribution
It provides empirical evidence on the variability of in-context learning robustness across models and highlights the impact of pre-training data and prompting techniques.
Findings
Models pre-trained on code generalize better to syntax tasks.
Chain-of-thought prompting improves out-of-distribution generalization.
Variance in performance is due more to training data than to model size.
Abstract
In-context learning (ICL) is now a common method for teaching large language models (LLMs) new tasks: given labeled examples in the input context, the LLM learns to perform the task without weight updates. Do models guided via ICL infer the underlying structure of the task defined by the context, or do they rely on superficial heuristics that only generalize to identically distributed examples? We address this question using transformations tasks and an NLI task that assess sensitivity to syntax - a requirement for robust language understanding. We further investigate whether out-of-distribution generalization can be improved via chain-of-thought prompting, where the model is provided with a sequence of intermediate computation steps that illustrate how the task ought to be performed. In experiments with models from the GPT, PaLM, and Llama 2 families, we find large variance across LMs.…
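To make the setup in the abstract concrete, the following is a minimal sketch of how an in-context prompt might be assembled, with and without chain-of-thought steps. It is not the authors' code: the question-formation sentences, the `Input:`/`Reasoning:`/`Output:` labels, and the names `build_prompt`, `DEMOS`, and `QUERY` are all assumptions made up for illustration.

```python
from typing import List, Optional, Tuple

# One demonstration: (input sentence, optional reasoning steps, output sentence).
# All sentences are invented for illustration, not taken from the paper's data.
Demo = Tuple[str, Optional[str], str]

DEMOS: List[Demo] = [
    (
        "the zebra does chuckle.",
        "The main auxiliary is 'does'; move it to the front.",
        "does the zebra chuckle?",
    ),
    (
        "my yak that can swim does laugh.",
        "The main auxiliary is 'does', not 'can' inside the relative clause; "
        "move 'does' to the front.",
        "does my yak that can swim laugh?",
    ),
]

# A query whose first auxiliary is NOT the main one, so a "move the first
# auxiliary" heuristic and the syntactic rule give different answers.
QUERY = "the newt that does sleep can giggle."


def build_prompt(demos: List[Demo], query: str, use_cot: bool = False) -> str:
    """Format labeled demonstrations plus a query into one prompt string.

    With use_cot=True each demonstration also shows its reasoning steps,
    and the prompt ends by asking the model to reason before answering.
    """
    blocks = []
    for source, reasoning, target in demos:
        lines = [f"Input: {source}"]
        if use_cot and reasoning:
            lines.append(f"Reasoning: {reasoning}")
        lines.append(f"Output: {target}")
        blocks.append("\n".join(lines))
    # The final block is left open for the model to complete.
    blocks.append(f"Input: {query}\n" + ("Reasoning:" if use_cot else "Output:"))
    return "\n\n".join(blocks)


if __name__ == "__main__":
    print(build_prompt(DEMOS, QUERY))
    print("-----")
    print(build_prompt(DEMOS, QUERY, use_cot=True))
```

The resulting string would be passed to a model as-is; no weights are updated, and the only difference between the two conditions is whether the intermediate steps appear in the context.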
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Weight Decay · Layer Normalization · Residual Connection · Byte Pair Encoding · Dropout · Softmax
