The Inherent Limits of Pretrained LLMs: The Unexpected Convergence of Instruction Tuning and In-Context Learning Capabilities
Irina Bigoulaeva, Harish Tayyar Madabushi, Iryna Gurevych

TL;DR
This paper investigates what instruction tuning actually changes in large language models. It finds that instruction tuning does not fundamentally alter a model's capabilities: instruction-tuned models, like base models prompted with in-context examples, remain bound by limits set by the pretraining data, which ties instruction tuning and in-context learning together.
Contribution
The study demonstrates that instruction-tuned models' performance is closely linked to their base models' in-context abilities, highlighting the limits imposed by pretraining data on both.
Findings
Instruction tuning performance correlates with base model in-context performance.
Pretraining data limits the tasks that instruction-tuned models can solve.
Instruction tuning does not fundamentally alter the core capabilities of LLMs.
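To make the first finding concrete, the sketch below shows one way such a link could be measured: a Pearson correlation between per-task scores of a base model prompted with in-context examples and of its instruction-tuned counterpart. This is not the authors' evaluation code. The task names and accuracy values are placeholders invented for illustration, and the choice of Pearson correlation is an assumption rather than something this page states.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Placeholder per-task accuracies, invented for illustration only.
# They are NOT results from the paper.
base_in_context = {"task_a": 0.42, "task_b": 0.71, "task_c": 0.18, "task_d": 0.55}
instruction_tuned = {"task_a": 0.47, "task_b": 0.75, "task_c": 0.21, "task_d": 0.52}

# Align the two score lists on the same task order before correlating.
tasks = sorted(base_in_context)
r = pearson(
    [base_in_context[t] for t in tasks],
    [instruction_tuned[t] for t in tasks],
)
print(f"Pearson r across {len(tasks)} tasks: {r:.3f}")
```

A value of r near 1 would mean that tasks the base model handles well in context are also the ones the instruction-tuned model handles well, which is the pattern the finding describes.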
Abstract
Large Language Models (LLMs), trained on extensive web-scale corpora, have demonstrated remarkable abilities across diverse tasks, especially as they are scaled up. Nevertheless, even state-of-the-art models struggle in certain cases, sometimes failing at problems solvable by young children, indicating that traditional notions of task complexity are insufficient for explaining LLM capabilities. However, exploring LLM capabilities is complicated by the fact that most widely-used models are also "instruction-tuned" to respond appropriately to prompts. With the goal of disentangling the factors influencing LLM performance, we investigate whether instruction-tuned models possess fundamentally different capabilities from base models that are prompted using in-context examples. Through extensive experiments across various model families, scales and task types, which included instruction…
Taxonomy
Topics: Natural Language Processing Techniques · Artificial Intelligence in Law
Methods: Balanced Selection
