ICLEval: Evaluating In-Context Learning Ability of Large Language Models
Wentong Chen, Yankai Lin, ZhenHao Zhou, HongYun Huang, Yantao Jia, Zhao Cao, Ji-Rong Wen

TL;DR
This paper introduces ICLEval, a new benchmark for evaluating the in-context learning abilities of large language models, revealing early development and stability of these abilities across models.
Contribution
The paper presents ICLEval, a benchmark that assesses the ICL abilities of LLMs through two sub-abilities, exact copying and rule learning. It shows that these abilities develop early in pretraining and that model size is not the sole determinant of ICL performance.
Findings
ICL ability is widespread across different LLMs.
Model size is not the only factor influencing ICL performance.
ICL abilities, especially copying, develop early and stabilize during pretraining.
Abstract
In-Context Learning (ICL) is a critical capability of Large Language Models (LLMs) as it empowers them to comprehend and reason across interconnected inputs. Evaluating the ICL ability of LLMs can enhance their utilization and deepen our understanding of how this ability is acquired at the training stage. However, existing evaluation frameworks primarily focus on language abilities and knowledge, often overlooking the assessment of ICL ability. In this work, we introduce the ICLEval benchmark to evaluate the ICL abilities of LLMs, which encompasses two key sub-abilities: exact copying and rule learning. Through the ICLEval benchmark, we demonstrate that ICL ability is universally present in different LLMs, and model size is not the sole determinant of ICL efficacy. Surprisingly, we observe that ICL abilities, particularly copying, develop early in the pretraining process and stabilize…
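To make the "exact copying" sub-ability concrete, here is a minimal sketch of what such a probe could look like. It is an illustration under stated assumptions, not ICLEval's actual task format: the key-value layout, the random-string keys, and the names `random_key`, `build_copy_prompt`, `exact_match`, and `lookup_baseline` are all hypothetical. The idea is that the answer appears only in the context, so a correct output must be copied rather than recalled.

```python
import random
import string


def random_key(rng, length=8):
    """Return a random lowercase/digit string, unlikely to appear in any training data."""
    alphabet = string.ascii_lowercase + string.digits
    return "".join(rng.choice(alphabet) for _ in range(length))


def build_copy_prompt(rng, n_pairs=5):
    """Build a key-value prompt whose answer can only be found by copying from context.

    Hypothetical format: one "key: value" pair per line, then a query line "key:".
    """
    keys = []
    while len(keys) < n_pairs:  # keep keys unique so the lookup is unambiguous
        k = random_key(rng)
        if k not in keys:
            keys.append(k)
    pairs = [(k, random_key(rng)) for k in keys]
    query_key, expected = rng.choice(pairs)
    lines = [f"{k}: {v}" for k, v in pairs]
    lines.append(f"{query_key}:")
    return "\n".join(lines), expected


def exact_match(prediction, expected):
    """Score a prediction: correct only if it reproduces the value exactly."""
    return prediction.strip() == expected


def lookup_baseline(prompt):
    """Stand-in for a model (assumption): answers by looking the query key up in the prompt."""
    *context, query = prompt.split("\n")
    table = dict(line.split(": ", 1) for line in context)
    return table[query.rstrip(":")]


if __name__ == "__main__":
    rng = random.Random(0)
    prompt, expected = build_copy_prompt(rng)
    print(prompt)
    print("correct:", exact_match(lookup_baseline(prompt), expected))
```

In practice `lookup_baseline` would be replaced by a call to the model under evaluation, and accuracy would be averaged over many sampled prompts. A rule-learning probe could follow the same pattern, with examples that share a transformation instead of a lookup.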
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
Methods: Focus
