Analyzing limits for in-context learning

Omar Naim; Jerome Bolte; Nicholas Asher

arXiv:2502.03503·stat.ML·November 10, 2025

Open Access

TL;DR

This paper critically examines what transformer models can do when learning in context, presenting empirical evidence and a mathematical analysis that point to limits on their general predictive accuracy.

Contribution

The paper challenges prior claims that transformers, when learning in context, implicitly implement standard learning algorithms, and argues that architectural constraints prevent them from fully doing so.

Findings

01

Empirical evidence contradicts the idea that transformers learn standard algorithms.

02

Mathematical analysis shows inherent architectural limitations.

03

Transformers cannot attain universal predictive accuracy.

Abstract

Our paper challenges claims from prior research that transformer-based models, when learning in context, implicitly implement standard learning algorithms. We present empirical evidence inconsistent with this view and provide a mathematical analysis demonstrating that transformers cannot achieve general predictive accuracy due to inherent architectural limitations.

Taxonomy

Topics: Neural Networks and Applications

Methods: Softmax · Attention Is All You Need · Layer Normalization