On the Noise Robustness of In-Context Learning for Text Generation

Hongfu Gao; Feipeng Zhang; Wenyu Jiang; Jun Shu; Feng Zheng; Hongxin Wei

arXiv:2405.17264 · cs.CL · October 25, 2024

TL;DR

This paper shows that noisy demonstrations hurt in-context learning for text generation with large language models, and proposes Local Perplexity Ranking (LPR) to make demonstration selection more robust to such noise.

Contribution

The paper introduces Local Perplexity Ranking (LPR), a method that replaces likely-noisy demonstration candidates with nearest neighbors that are more likely to be clean, improving the robustness of in-context learning on text generation tasks.

Findings

01 LPR improves EM scores by up to 18.75 on noisy benchmarks.
02 Noisy annotations significantly degrade in-context learning performance on text generation tasks.
03 LPR effectively filters noisy demonstrations while preserving the effectiveness of existing demonstration selection methods.
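For readers unfamiliar with the metric behind the first finding, the following is a minimal sketch of how an exact-match (EM) score is commonly computed. It assumes EM here means exact match reported on a 0-100 scale and uses a SQuAD-style answer normalization; the function names `normalize`, `exact_match`, and `em_score` are illustrative and are not taken from the paper or its repository.

```python
import re
import string


def normalize(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumption: SQuAD-style normalization; the paper may normalize differently.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, references):
    """Return 1.0 if the prediction equals any reference after normalization."""
    return float(any(normalize(prediction) == normalize(r) for r in references))


def em_score(predictions, references_list):
    """Average exact match over a dataset, as a percentage."""
    scores = [exact_match(p, refs) for p, refs in zip(predictions, references_list)]
    return 100.0 * sum(scores) / len(scores)


# Toy data, invented for illustration only.
predictions = ["The Eiffel Tower.", "Berlin"]
references = [["Eiffel Tower"], ["Paris"]]
print(em_score(predictions, references))
```

Under this reading, a gain of 18.75 would be a difference in percentage points between two such scores.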

Abstract

Large language models (LLMs) have shown impressive performance on downstream tasks by in-context learning (ICL), which heavily relies on the quality of demonstrations selected from a large set of annotated examples. Recent works claim that in-context learning is robust to noisy demonstrations in text classification. In this work, we show that, on text generation tasks, noisy annotations significantly hurt the performance of in-context learning. To circumvent the issue, we propose a simple and effective approach called Local Perplexity Ranking (LPR), which replaces the "noisy" candidates with their nearest neighbors that are more likely to be clean. Our method is motivated by analyzing the perplexity deviation caused by noisy labels and decomposing perplexity into inherent perplexity and matching perplexity. Our key idea behind LPR is thus to decouple the matching perplexity by…
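The replacement idea in the abstract can be illustrated with a small sketch. Because the abstract is truncated here, the details are assumptions rather than the authors' procedure: each candidate's perplexity is assumed to be precomputed by an LLM, neighbors are found by cosine similarity over assumed sentence embeddings, a candidate is flagged as likely noisy when more than a fraction `gamma` of its `k` nearest neighbors have lower perplexity, and a flagged candidate is swapped for its lowest-perplexity neighbor. The names `cosine`, `local_perplexity_ranking`, `k`, and `gamma` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def local_perplexity_ranking(embeddings, perplexities, k=2, gamma=0.5):
    """For each candidate index, return the index to use in its place.

    Assumed rule (not confirmed by the source): compare a candidate's
    perplexity only against its k nearest neighbors, so that examples that
    are inherently hard are not penalized against unrelated easy ones.
    """
    n = len(perplexities)
    chosen = list(range(n))
    for i in range(n):
        others = [j for j in range(n) if j != i]
        # k nearest neighbors of candidate i in embedding space.
        neighbors = sorted(
            others,
            key=lambda j: cosine(embeddings[i], embeddings[j]),
            reverse=True,
        )[:k]
        # Fraction of neighbors whose perplexity is strictly lower.
        lower = sum(perplexities[j] < perplexities[i] for j in neighbors)
        if lower / len(neighbors) > gamma:
            # Likely noisy: swap in the neighbor with the lowest perplexity.
            chosen[i] = min(neighbors, key=lambda j: perplexities[j])
    return chosen


# Toy data, invented for illustration: two clusters of three candidates.
embeddings = [
    [1.0, 0.0], [0.9, 0.1], [0.95, 0.05],
    [0.0, 1.0], [0.1, 0.9], [0.05, 0.95],
]
perplexities = [2.0, 2.1, 9.0, 3.0, 3.0, 3.0]
print(local_perplexity_ranking(embeddings, perplexities))
```

In this toy setup only candidate 2 is replaced (by candidate 0), since its perplexity is far above that of both of its neighbors. The second cluster has a higher perplexity than the first overall but is left untouched, which is the point of ranking locally rather than globally.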

Code & Models

Repositories

ml-stat-sustech/local-perplexity-ranking (PyTorch, official)

Videos

On the Noise Robustness of In-Context Learning for Text Generation · SlidesLive

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems

Methods: Sparse Evolutionary Training