Test-Time Adaptation via Many-Shot Prompting: Benefits, Limits, and Pitfalls
Shubhangi Upasani, Chen Wu, Jay Rainton, Bo Li, Urmish Thakker, Changran Hu, Qizheng Zhang

TL;DR
This paper empirically investigates many-shot prompting as a form of test-time adaptation for large language models, analyzing where it helps, where it stops helping, and how update strategy, example ordering, and selection policy affect results across tasks and models.
Contribution
It provides a comprehensive analysis of many-shot prompting, exploring how update strategies and selection policies affect performance and highlighting practical limits of prompt-based adaptation.
Findings
Effective for structured tasks with high information gain
Highly sensitive to selection strategy and demonstration order
Limited benefits for open-ended generation tasks
Abstract
Test-time adaptation enables large language models (LLMs) to modify their behavior at inference without updating model parameters. A common approach is many-shot prompting, where large numbers of in-context learning (ICL) examples are injected as an input-space test-time update. Although performance can improve as more demonstrations are added, the reliability and limits of this update mechanism remain poorly understood, particularly for open-source models. We present an empirical study of many-shot prompting across tasks and model backbones, analyzing how performance varies with update magnitude, example ordering, and selection policy. We further study Dynamic and Reinforced ICL as alternative test-time update strategies that control which information is injected and how it constrains model behavior. We find that many-shot prompting is effective for structured tasks where…
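To make the mechanism in the abstract concrete, the following is a minimal sketch of how a many-shot prompt might be assembled, with the number of demonstrations, the selection policy, and the ordering exposed as separate knobs. It is not the authors' implementation: the function names (`select_demos`, `build_prompt`), the `Input:`/`Label:` template, and the token-overlap similarity are all assumptions chosen for illustration.

```python
import random
from typing import List, Optional, Tuple

Demo = Tuple[str, str]  # (input text, label); a hypothetical demonstration format


def _overlap(a: str, b: str) -> float:
    """Jaccard overlap between lowercase token sets (a stand-in similarity measure)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def select_demos(pool: List[Demo], query: str, k: int,
                 policy: str = "random", seed: int = 0) -> List[Demo]:
    """Pick up to k demonstrations from the pool under a given selection policy."""
    if policy == "random":
        # Query-independent selection, reproducible for a fixed seed.
        return random.Random(seed).sample(pool, min(k, len(pool)))
    if policy == "similarity":
        # Query-dependent selection: the k demos closest to the query come first.
        ranked = sorted(pool, key=lambda d: _overlap(d[0], query), reverse=True)
        return ranked[:k]
    raise ValueError(f"unknown policy: {policy}")


def build_prompt(demos: List[Demo], query: str,
                 shuffle_seed: Optional[int] = None) -> str:
    """Concatenate demonstrations and the query into a single prompt string."""
    demos = list(demos)
    if shuffle_seed is not None:
        # Reordering the same demos lets one probe sensitivity to example order.
        random.Random(shuffle_seed).shuffle(demos)
    blocks = [f"Input: {x}\nLabel: {y}" for x, y in demos]
    blocks.append(f"Input: {query}\nLabel:")
    return "\n\n".join(blocks)
```

In this sketch, increasing `k` corresponds to a larger update magnitude, switching `policy` changes which information is injected, and varying `shuffle_seed` while holding the selected demonstrations fixed isolates the effect of ordering.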
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Topic Modeling · Artificial Intelligence in Healthcare and Education
