Transformers Learn Robust In-Context Regression under Distributional Uncertainty

Hoang T. H. Cao; Hai D. V. Trinh; Tho Quan; Lan V. Truong

arXiv:2603.18564 · cs.LG · March 20, 2026

TL;DR

This paper investigates whether Transformers can perform robust in-context linear regression under realistic distributional uncertainty, including non-Gaussian and dependent data, and finds that they match or outperform classical baselines.

Contribution

The paper demonstrates that Transformers can learn in-context regression under a broad range of distributional shifts, going beyond the i.i.d. data, Gaussian noise, and Gaussian coefficient assumptions of prior work.

Findings

01. Transformers match or outperform classical estimators under the distributional shifts studied: non-Gaussian coefficients, heavy-tailed noise, and non-i.i.d. prompts.

02. Robust in-context learning is achievable with Transformers even when the data are non-Gaussian and dependent.

03. Transformers demonstrate adaptability beyond classical maximum-likelihood-based methods.

Abstract

Recent work has shown that Transformers can perform in-context learning for linear regression under restrictive assumptions, including i.i.d. data, Gaussian noise, and Gaussian regression coefficients. However, real-world data often violate these assumptions: the distributions of inputs, noise, and coefficients are typically unknown, non-Gaussian, and may exhibit dependency across the prompt. This raises a fundamental question: can Transformers learn effectively in-context under realistic distributional uncertainty? We study in-context learning for noisy linear regression under a broad range of distributional shifts, including non-Gaussian coefficients, heavy-tailed noise, and non-i.i.d. prompts. We compare Transformers against classical baselines that are optimal or suboptimal under the corresponding maximum-likelihood criteria. Across all settings, Transformers consistently match or…
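
To make the setting in the abstract concrete, here is a minimal sketch of one noisy linear regression prompt with non-Gaussian coefficients and heavy-tailed noise, scored with two classical baselines. Everything specific is an assumption chosen for illustration and is not taken from the paper: the function names (`sample_prompt`, `ols`, `ridge`), the Laplace coefficient prior, the Student-t noise, and the dimensions. Non-i.i.d. prompts are not modelled here.

```python
import numpy as np

def sample_prompt(n_points=40, dim=8, noise_df=3.0, noise_scale=0.5, seed=0):
    """Draw one regression prompt (X, y) and its hidden coefficient vector w.

    Assumed for illustration only: Laplace coefficients (non-Gaussian),
    Gaussian inputs, and Student-t noise (heavy-tailed).
    """
    rng = np.random.default_rng(seed)
    w = rng.laplace(loc=0.0, scale=1.0, size=dim)
    X = rng.standard_normal((n_points, dim))
    noise = noise_scale * rng.standard_t(df=noise_df, size=n_points)
    y = X @ w + noise
    return X, y, w

def ols(X, y):
    """Ordinary least squares estimate of the coefficients."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def ridge(X, y, lam=1.0):
    """Ridge estimate; lam is a hypothetical regularization strength."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

if __name__ == "__main__":
    X, y, w = sample_prompt()
    # Use all but the last pair as context and the last pair as the query.
    X_ctx, y_ctx = X[:-1], y[:-1]
    x_query, y_query = X[-1], y[-1]
    for name, estimator in [("ols", ols), ("ridge", ridge)]:
        w_hat = estimator(X_ctx, y_ctx)
        sq_err = float((x_query @ w_hat - y_query) ** 2)
        print(f"{name}: squared error on query = {sq_err:.4f}")
```

An in-context learner would receive the same context pairs and query input as a single prompt and be scored on the same query target, so the baselines above give a reference point for its error.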

Taxonomy

Topics: Gaussian Processes and Bayesian Inference · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning