In-context Learning Generalizes, But Not Always Robustly: The Case of Syntax

Aaron Mueller; Albert Webson; Jackson Petty; Tal Linzen

arXiv:2311.07811 · cs.CL · April 11, 2024 · 1 citation

TL;DR

This paper investigates whether large language models truly understand syntax through in-context learning or rely on superficial heuristics, examining their robustness and the effects of chain-of-thought prompting across different models.

Contribution

It provides empirical evidence on the variability of in-context learning robustness across models and highlights the impact of pre-training data and prompting techniques.

Findings

01. Models pre-trained on code generalize better to syntax tasks.

02. Chain-of-thought prompting improves out-of-distribution generalization.

03. Variance in performance is explained more by training data than by model size.

Abstract

In-context learning (ICL) is now a common method for teaching large language models (LLMs) new tasks: given labeled examples in the input context, the LLM learns to perform the task without weight updates. Do models guided via ICL infer the underlying structure of the task defined by the context, or do they rely on superficial heuristics that only generalize to identically distributed examples? We address this question using transformation tasks and an NLI task that assess sensitivity to syntax, a requirement for robust language understanding. We further investigate whether out-of-distribution generalization can be improved via chain-of-thought prompting, where the model is provided with a sequence of intermediate computation steps that illustrate how the task ought to be performed. In experiments with models from the GPT, PaLM, and Llama 2 families, we find large variance across LMs.…
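
The in-context setup described in the abstract can be illustrated with a minimal sketch: labeled demonstrations are concatenated ahead of an unlabeled query, and the model is expected to continue the pattern without any weight updates. The helper name `build_icl_prompt`, the `Input:`/`Output:` template, and the example sentences are all hypothetical and chosen for illustration; they are not taken from the paper's data or code. The query is deliberately structured differently from the demonstrations (it contains a relative clause), which is the kind of case where a surface heuristic and a syntactic rule would give different answers.

```python
from typing import List, Tuple


def build_icl_prompt(demos: List[Tuple[str, str]], query: str) -> str:
    """Concatenate labeled (input, output) demonstrations and an unlabeled query.

    The template is an assumption made for this sketch, not the paper's format.
    """
    parts = [f"Input: {source}\nOutput: {target}" for source, target in demos]
    # The final block leaves the output empty so the model has to fill it in.
    parts.append(f"Input: {query}\nOutput:")
    return "\n\n".join(parts)


# Hypothetical demonstrations: turn a declarative sentence into a question.
# Each sentence has exactly one auxiliary, so "move the first auxiliary" and
# "move the main-clause auxiliary" both fit the demonstrations equally well.
demos = [
    ("the dog has eaten the bone.", "has the dog eaten the bone?"),
    ("the birds can see the cat.", "can the birds see the cat?"),
]

# Hypothetical out-of-distribution query: the first auxiliary ("can") sits
# inside a relative clause, so the two rules above now disagree.
query = "the dog that can bark has eaten the bone."

prompt = build_icl_prompt(demos, query)
print(prompt)
```

A chain-of-thought variant of this sketch would place intermediate steps between each `Input:` and `Output:` line in the demonstrations; the exact form of those steps is not specified on this page.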

Code & Models

Repositories

aaronmueller/syntax-icl
PyTorch · Official

Videos

In-context Learning Generalizes, But Not Always Robustly: The Case of Syntax · Underline

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Weight Decay · Layer Normalization · Residual Connection · Byte Pair Encoding · Dropout · Softmax