Many-Shot CoT-ICL: Making In-Context Learning Truly Learn
Tsz Ting Chung, Lemao Liu, Mo Yu, Dit-Yan Yeung

TL;DR
This paper studies many-shot chain-of-thought in-context learning (CoT-ICL) in large language models, shows that its scaling behavior depends on both the model type and the task type, and proposes a demonstration ordering method to improve reasoning performance.
Contribution
The paper characterizes how CoT-ICL scales across tasks and models, and introduces Curvilinear Demonstration Selection to improve reasoning accuracy.
Findings
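Increasing the number of CoT demonstrations mainly benefits reasoning-oriented models; for non-reasoning models the effect is unstable.
Similarity-based retrieval helps on non-reasoning tasks but is ineffective for reasoning tasks.
Ordering the demonstrations improves performance, with gains of up to 5.42%.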
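To make the similarity-based retrieval baseline mentioned in the findings concrete, here is a minimal sketch of how demonstrations might be retrieved and assembled into a CoT-ICL prompt. It is an illustration only, not the paper's implementation and not Curvilinear Demonstration Selection. The function names (`bag_of_words`, `cosine`, `retrieve`, `build_prompt`), the prompt template, and the toy demonstration pool are all hypothetical, and a bag-of-words cosine score stands in for whatever embedding model a real system would use.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased word counts; a crude stand-in (assumption) for a sentence embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(pool, query, k):
    """Return the k demonstrations whose questions score highest against the query."""
    q = bag_of_words(query)
    ranked = sorted(
        pool,
        key=lambda d: cosine(bag_of_words(d["question"]), q),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(demos, query):
    """Concatenate CoT demonstrations, then append the unanswered query."""
    blocks = [
        f"Q: {d['question']}\nReasoning: {d['rationale']}\nA: {d['answer']}"
        for d in demos
    ]
    blocks.append(f"Q: {query}\nReasoning:")
    return "\n\n".join(blocks)


# Hypothetical toy pool; real many-shot settings use dozens to hundreds of examples.
pool = [
    {"question": "Tom has 3 apples and buys 4 more. How many apples?",
     "rationale": "3 + 4 = 7.", "answer": "7"},
    {"question": "A train travels 60 km in 2 hours. What is its speed?",
     "rationale": "60 / 2 = 30.", "answer": "30 km/h"},
    {"question": "Ann has 5 apples and eats 2. How many apples remain?",
     "rationale": "5 - 2 = 3.", "answer": "3"},
]
query = "Sam has 6 apples and gives away 1. How many apples are left?"
demos = retrieve(pool, query, k=2)
prompt = build_prompt(demos, query)
```

Note that this scorer rewards surface overlap (shared words such as "apples") rather than shared reasoning structure, which is one plausible reading of why similarity-based selection could fall short on reasoning tasks.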
Abstract
In-context learning (ICL) adapts large language models (LLMs) to new tasks by conditioning on demonstrations in the prompt without parameter updates. With long-context models, many-shot ICL can use dozens to hundreds of examples and achieve performance comparable to fine-tuning, yet current understanding of its scaling behavior is largely derived from non-reasoning tasks. We study many-shot chain-of-thought in-context learning (CoT-ICL) for reasoning and show that standard many-shot rules do not transfer. Across non-reasoning and reasoning-oriented LLMs and across non-reasoning and reasoning tasks, we find: (i) a setting-dependent scaling effect, where increasing the number of CoT demonstrations is unstable for non-reasoning LLMs and benefits mainly reasoning-oriented LLMs; (ii) similarity-based retrieval helps on non-reasoning tasks but fails on reasoning, since semantic similarity…
