ICLEval: Evaluating In-Context Learning Ability of Large Language Models

Wentong Chen; Yankai Lin; ZhenHao Zhou; HongYun Huang; Yantao Jia; Zhao Cao; Ji-Rong Wen

arXiv:2406.14955 · cs.CL · December 10, 2024

TL;DR

This paper introduces ICLEval, a benchmark for evaluating the in-context learning (ICL) abilities of large language models, and finds that these abilities emerge early in pretraining and then stabilize.

Contribution

The paper presents ICLEval, a benchmark that assesses ICL ability in LLMs through two sub-abilities, exact copying and rule learning, and shows that these abilities develop early in pretraining and are not determined by model size alone.

Findings

1. ICL ability is widespread across different LLMs.
2. Model size is not the only factor influencing ICL performance.
3. ICL abilities, especially copying, develop early and stabilize during pretraining.

Abstract

In-Context Learning (ICL) is a critical capability of Large Language Models (LLMs) as it empowers them to comprehend and reason across interconnected inputs. Evaluating the ICL ability of LLMs can enhance their utilization and deepen our understanding of how this ability is acquired at the training stage. However, existing evaluation frameworks primarily focus on language abilities and knowledge, often overlooking the assessment of ICL ability. In this work, we introduce the ICLEval benchmark to evaluate the ICL abilities of LLMs, which encompasses two key sub-abilities: exact copying and rule learning. Through the ICLEval benchmark, we demonstrate that ICL ability is universally present in different LLMs, and model size is not the sole determinant of ICL efficacy. Surprisingly, we observe that ICL abilities, particularly copying, develop early in the pretraining process and stabilize…
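
As a rough illustration of what an exact-copying probe could look like, the sketch below builds prompts of random key-value pairs and scores a model by exact match on the value it returns for a queried key. This is not the ICLEval implementation: the prompt format, the function names (`build_copy_prompt`, `exact_match_accuracy`, `lookup_baseline`), and the baseline "model" are all assumptions made for this example.

```python
import random
import string

ALPHABET = string.ascii_lowercase + string.digits


def random_token(rng, length=8):
    """Return a random string that carries no prior knowledge."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))


def build_copy_prompt(rng, n_pairs=5):
    """Build one hypothetical copying example: context pairs plus a query key."""
    keys = set()
    while len(keys) < n_pairs:  # guarantee unique keys
        keys.add(random_token(rng))
    pairs = [(k, random_token(rng)) for k in sorted(keys)]
    query_key, answer = rng.choice(pairs)
    lines = [f"{k} -> {v}" for k, v in pairs]
    prompt = "\n".join(lines) + f"\n{query_key} ->"
    return prompt, answer


def exact_match_accuracy(model, n_examples=20, seed=0):
    """Score a callable prompt -> completion by exact match on the copied value."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_examples):
        prompt, answer = build_copy_prompt(rng)
        hits += model(prompt).strip() == answer
    return hits / n_examples


def lookup_baseline(prompt):
    """Stand-in for an LLM (assumption): copies the value by table lookup."""
    *context, query = prompt.split("\n")
    key = query.rsplit(" ->", 1)[0]
    table = dict(line.split(" -> ") for line in context)
    return table[key]
```

To try it with a real model, one would replace `lookup_baseline` with a function that sends the prompt to the model and returns its completion. Random strings are used so that a correct answer can only come from the context, not from memorized knowledge.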

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification

Methods: Focus