Test-Time Adaptation via Many-Shot Prompting: Benefits, Limits, and Pitfalls

Shubhangi Upasani; Chen Wu; Jay Rainton; Bo Li; Urmish Thakker; Changran Hu; Qizheng Zhang

arXiv:2603.05829 · cs.LG · March 18, 2026

Open Access

TL;DR

This paper empirically investigates many-shot prompting for test-time adaptation of large language models, analyzing its effectiveness, limitations, and the impact of different strategies across various tasks and models.

Contribution

It provides a comprehensive analysis of many-shot prompting, exploring how update strategies and selection policies affect performance and highlighting practical limits of prompt-based adaptation.

Findings

1. Effective for structured tasks with high information gain
2. Highly sensitive to selection strategy and demonstration order
3. Limited benefits for open-ended generation tasks

Abstract

Test-time adaptation enables large language models (LLMs) to modify their behavior at inference without updating model parameters. A common approach is many-shot prompting, where large numbers of in-context learning (ICL) examples are injected as an input-space test-time update. Although performance can improve as more demonstrations are added, the reliability and limits of this update mechanism remain poorly understood, particularly for open-source models. We present an empirical study of many-shot prompting across tasks and model backbones, analyzing how performance varies with update magnitude, example ordering, and selection policy. We further study Dynamic and Reinforced ICL as alternative test-time update strategies that control which information is injected and how it constrains model behavior. We find that many-shot prompting is effective for structured tasks where…
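As a rough illustration of the setup the abstract describes, the sketch below assembles a many-shot prompt from a pool of demonstrations under two selection policies. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`select_random`, `select_similar`, `build_prompt`), the `Input:`/`Output:` template, and the token-overlap similarity are all hypothetical choices made for this example, and the similarity-based policy is only a simple stand-in and should not be read as the paper's Dynamic or Reinforced ICL.

```python
import random
from typing import List, Tuple

# A demonstration is an (input, output) pair. This layout is an assumption
# made for the example, not a format taken from the paper.
Example = Tuple[str, str]


def select_random(pool: List[Example], k: int, seed: int = 0) -> List[Example]:
    """Pick up to k demonstrations uniformly at random (seeded for repeatability)."""
    rng = random.Random(seed)
    return rng.sample(pool, min(k, len(pool)))


def token_overlap(a: str, b: str) -> float:
    """Jaccard overlap of lowercase whitespace tokens; a crude similarity proxy."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def select_similar(pool: List[Example], query: str, k: int) -> List[Example]:
    """Pick the k demonstrations whose inputs overlap most with the query."""
    ranked = sorted(pool, key=lambda ex: token_overlap(ex[0], query), reverse=True)
    return ranked[:k]


def build_prompt(demos: List[Example], query: str) -> str:
    """Concatenate demonstrations in the given order, then append the query."""
    blocks = [f"Input: {x}\nOutput: {y}" for x, y in demos]
    blocks.append(f"Input: {query}\nOutput:")
    return "\n\n".join(blocks)


if __name__ == "__main__":
    pool = [
        ("add 2 and 3", "5"),
        ("capital of France", "Paris"),
        ("add 10 and 4", "14"),
    ]
    demos = select_similar(pool, "add 7 and 8", k=2)
    print(build_prompt(demos, "add 7 and 8"))
```

Because `build_prompt` preserves the order of `demos`, passing the same demonstrations in a different order yields a different prompt, which is one way to probe the ordering sensitivity the findings mention. The number of demonstrations `k` plays the role of the update magnitude.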

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Topic Modeling · Artificial Intelligence in Healthcare and Education