The Inherent Limits of Pretrained LLMs: The Unexpected Convergence of Instruction Tuning and In-Context Learning Capabilities

Irina Bigoulaeva; Harish Tayyar Madabushi; Iryna Gurevych

arXiv:2501.08716 · cs.CL · January 16, 2025

TL;DR

This paper investigates the capabilities of large language models and finds that instruction tuning does not fundamentally change them. The abilities of instruction-tuned models remain bounded by limits set by the pretraining data, which links instruction tuning to the in-context learning abilities of base models.

Contribution

The study demonstrates that instruction-tuned models' performance is closely linked to their base models' in-context abilities, highlighting the limits imposed by pretraining data on both.

Findings

01. Instruction tuning performance correlates with base model in-context performance.

02. Pretraining data limits the tasks that instruction-tuned models can solve.

03. Instruction tuning does not fundamentally alter the core capabilities of LLMs.
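
To make the first finding concrete, the following is a minimal sketch of how a correlation between per-task scores could be computed. It is not the authors' evaluation code: the task names and accuracy values are made-up placeholders, and the `pearson` helper is a hypothetical name introduced only for this illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_x * var_y)


# Placeholder per-task accuracies (invented for illustration, not paper results).
base_in_context = {"task_a": 0.42, "task_b": 0.71, "task_c": 0.18, "task_d": 0.55}
instruction_tuned = {"task_a": 0.47, "task_b": 0.74, "task_c": 0.21, "task_d": 0.60}

# Compare only tasks that were scored under both settings.
tasks = sorted(base_in_context.keys() & instruction_tuned.keys())
r = pearson(
    [base_in_context[t] for t in tasks],
    [instruction_tuned[t] for t in tasks],
)
print(f"Pearson r over {len(tasks)} tasks: {r:.3f}")
```

A value of r close to 1 on real evaluation scores would be consistent with the claim that tasks a base model handles well in-context are also the tasks its instruction-tuned counterpart handles well.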

Abstract

Large Language Models (LLMs), trained on extensive web-scale corpora, have demonstrated remarkable abilities across diverse tasks, especially as they are scaled up. Nevertheless, even state-of-the-art models struggle in certain cases, sometimes failing at problems solvable by young children, indicating that traditional notions of task complexity are insufficient for explaining LLM capabilities. However, exploring LLM capabilities is complicated by the fact that most widely-used models are also "instruction-tuned" to respond appropriately to prompts. With the goal of disentangling the factors influencing LLM performance, we investigate whether instruction-tuned models possess fundamentally different capabilities from base models that are prompted using in-context examples. Through extensive experiments across various model families, scales and task types, which included instruction…

Code & Models

Repositories

ukplab/arxiv2025-inherent-limits-plms (Official)

Taxonomy

Topics: Natural Language Processing Techniques · Artificial Intelligence in Law

Methods: Balanced Selection