Deeper Insights Without Updates: The Power of In-Context Learning Over Fine-Tuning

Qingyu Yin; Xuzheng He; Luoao Deng; Chak Tou Leong; Fan Wang; Yanzhao Yan; Xiaoyu Shen; Qiang Zhang

arXiv:2410.04691·cs.LG·October 8, 2024

TL;DR

This paper shows that in-context learning (ICL) captures implicit patterns in tasks better than fine-tuning, especially with limited data, challenging the common belief that fine-tuning surpasses ICL once enough training samples are available.

Contribution

The study provides empirical evidence and a mechanistic theory explaining why ICL better captures implicit patterns than fine-tuning in large language models.

Findings

01 ICL outperforms fine-tuning on implicit pattern tasks.

02 Models with ICL quickly grasp deep patterns, improving accuracy.

03 Fine-tuning shows limited gains despite more training data.

Abstract

Fine-tuning and in-context learning (ICL) are two prevalent methods in imbuing large language models with task-specific knowledge. It is commonly believed that fine-tuning can surpass ICL given sufficient training samples as it allows the model to adjust its internal parameters based on the data. However, this paper presents a counterintuitive finding: For tasks with implicit patterns, ICL captures these patterns significantly better than fine-tuning. We developed several datasets featuring implicit patterns, such as sequences determining answers through parity or identifying reducible terms in calculations. We then evaluated the models' understanding of these patterns under both fine-tuning and ICL across models ranging from 0.5B to 7B parameters. The results indicate that models employing ICL can quickly grasp deep patterns and significantly improve accuracy. In contrast, fine-tuning,…
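To make the idea of an "implicit pattern" concrete, here is a minimal sketch of a parity-style toy task and a few-shot ICL prompt built from it. It is an illustration only, not the paper's datasets or prompt format: the task definition, the function names (`parity_label`, `make_example`, `build_icl_prompt`), and the `Input:`/`Output:` layout are all assumptions chosen for this example.

```python
import random

def parity_label(seq):
    """Return 1 if the sum of seq is odd, else 0 (hypothetical toy task)."""
    return sum(seq) % 2

def make_example(rng, length=5, low=0, high=99):
    """Draw a random integer sequence and its parity label."""
    seq = [rng.randint(low, high) for _ in range(length)]
    return seq, parity_label(seq)

def build_icl_prompt(demos, query):
    """Format labeled demonstrations followed by an unlabeled query.

    The Input/Output layout is an assumption, not the paper's format.
    """
    blocks = [
        f"Input: {' '.join(map(str, seq))}\nOutput: {label}"
        for seq, label in demos
    ]
    blocks.append(f"Input: {' '.join(map(str, query))}\nOutput:")
    return "\n\n".join(blocks)

rng = random.Random(0)  # fixed seed so the sketch is reproducible
demos = [make_example(rng) for _ in range(4)]
query, answer = make_example(rng)
prompt = build_icl_prompt(demos, query)

# The implicit shortcut: the label depends only on how many entries are odd,
# so the full sum never has to be computed.
shortcut = sum(x % 2 for x in query) % 2
```

In this sketch the shortcut value always equals the label, which is the sense in which the pattern is implicit: a model can answer correctly without carrying out the explicit arithmetic. For a fine-tuning comparison, the same `(sequence, label)` pairs would serve as training examples instead of being placed in the prompt.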

Code & Models

Repositories

mikastars39/iclvsfinetune (PyTorch, official)

Taxonomy

Topics: Seismology and Earthquake Studies