When does In-context Learning Fall Short and Why? A Study on Specification-Heavy Tasks

Hao Peng; Xiaozhi Wang; Jianhui Chen; Weikai Li; Yunjia Qi; Zimu Wang; Zhili Wu; Kaisheng Zeng; Bin Xu; Lei Hou; Juanzi Li

arXiv:2311.08993·cs.CL·November 16, 2023·5 cites

TL;DR

This study investigates the limitations of in-context learning with large language models on complex, specification-heavy tasks, revealing key challenges and demonstrating that fine-tuning can significantly improve performance.

Contribution

The paper identifies specific reasons for ICL failure on complex tasks and shows that fine-tuning and instruction tuning can overcome these limitations.

Findings

01. ICL performs poorly on specification-heavy tasks: on most of them it cannot reach half of the state-of-the-art results.

02. The main failure reasons are an inability to specifically understand context, misalignment with humans in task schema comprehension, and inadequate long-text understanding.

03. Fine-tuning and instruction tuning significantly improve LLM performance on these tasks.

Abstract

In-context learning (ICL) has become the default method for using large language models (LLMs), making the exploration of its limitations and understanding the underlying causes crucial. In this paper, we find that ICL falls short of handling specification-heavy tasks, which are tasks with complicated and extensive task specifications, requiring several hours for ordinary humans to master, such as traditional information extraction tasks. The performance of ICL on these tasks mostly cannot reach half of the state-of-the-art results. To explore the reasons behind this failure, we conduct comprehensive experiments on 18 specification-heavy tasks with various LLMs and identify three primary reasons: inability to specifically understand context, misalignment in task schema comprehension with humans, and inadequate long-text understanding ability. Furthermore, we demonstrate that through…

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification