Dual Debiasing for Noisy In-Context Learning for Text Generation

Siqi Liang; Sumyeong Ahn; Paramveer S. Dhillon; Jiayu Zhou

arXiv:2506.00418·cs.CL·June 24, 2025

TL;DR

This paper introduces a dual debiasing framework that corrects perplexity-based noise detection for in-context learning in text generation, so that noisy demonstrations can still be identified when the noise ratio is high.

Contribution

The paper proposes a novel dual debiasing method that corrects perplexity biases using synthesized neighbors, enabling accurate sample cleanliness assessment under noisy annotations.
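As a rough illustration of the idea, the sketch below computes perplexity from per-token log-probabilities and then compares a sample against a set of neighbors. It is a minimal sketch under stated assumptions, not the paper's Sample Cleanliness Score: the function names, the log-ratio form of the score, and its sign convention are all hypothetical, and the token log-probabilities are assumed to come from whatever LLM is used for scoring.

```python
import math
from statistics import mean


def perplexity(token_logprobs):
    """Perplexity of an annotation: exp of the negative mean token log-probability."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-mean(token_logprobs))


def neighbor_relative_score(sample_logprobs, neighbor_logprobs):
    """Hypothetical score (an assumption, not the paper's formula).

    Returns log-perplexity of the sample minus the mean log-perplexity of its
    neighbors. Values near zero mean the sample looks like its neighbors;
    large positive values mean it is much harder for the model to predict.
    """
    sample_log_ppl = math.log(perplexity(sample_logprobs))
    neighbor_log_ppl = mean(math.log(perplexity(lp)) for lp in neighbor_logprobs)
    return sample_log_ppl - neighbor_log_ppl


# Toy inputs: every token has probability 0.25 in the sample, 0.5 in the neighbors.
sample = [math.log(0.25)] * 4
neighbors = [[math.log(0.5)] * 4, [math.log(0.5)] * 3]
score = neighbor_relative_score(sample, neighbors)
```

Subtracting in log space is one simple way to cancel a bias shared by the sample and its neighbors, such as a domain the model finds uniformly hard. How the paper synthesizes neighbors and combines the two corrections is not specified in this summary.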

Findings

1. Outperforms existing noise detection methods in high-noise scenarios

2. Achieves comparable performance to clean demonstration sets in ICL tasks

3. Remains robust even with extremely high noise ratios

Abstract

In-context learning (ICL) relies heavily on high-quality demonstrations drawn from large annotated corpora. Existing approaches detect noisy annotations by ranking local perplexities, presuming that noisy samples yield higher perplexities than their clean counterparts. However, this assumption breaks down when the noise ratio is high and many demonstrations are flawed. We reexamine the perplexity-based paradigm for text generation under noisy annotations, highlighting two sources of bias in perplexity: the annotation itself and the domain-specific knowledge inherent in large language models (LLMs). To overcome these biases, we introduce a dual debiasing framework that uses synthesized neighbors to explicitly correct perplexity estimates, yielding a robust Sample Cleanliness Score. This metric uncovers absolute sample cleanliness regardless of the overall corpus noise level. Extensive…
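To see one way a purely relative ranking can struggle as the noise ratio grows, consider the toy sketch below. It is an illustration under stated assumptions rather than a reproduction of any existing method: the fixed flagging budget, the helper names, and the perplexity values are all made up for the example.

```python
def flag_by_rank(perplexities, flag_fraction):
    """Flag the highest-perplexity `flag_fraction` of samples as noisy (hypothetical baseline)."""
    k = round(len(perplexities) * flag_fraction)
    ranked = sorted(range(len(perplexities)), key=lambda i: perplexities[i], reverse=True)
    return set(ranked[:k])


def recall(flagged, noisy):
    """Fraction of truly noisy samples that were flagged."""
    return len(flagged & noisy) / len(noisy)


# Low-noise corpus: 2 of 10 samples are noisy and clearly stand out.
low_ppl = [9.0, 8.5, 2.0, 2.1, 1.9, 2.2, 2.0, 1.8, 2.3, 2.1]
low_noisy = {0, 1}

# High-noise corpus: 8 of 10 samples are noisy, with similar perplexities.
high_ppl = [9.0, 8.5, 8.8, 9.1, 8.7, 8.9, 9.2, 8.6, 2.0, 2.1]
high_noisy = {0, 1, 2, 3, 4, 5, 6, 7}

low_recall = recall(flag_by_rank(low_ppl, 0.3), low_noisy)
high_recall = recall(flag_by_rank(high_ppl, 0.3), high_noisy)
```

With the same 30% budget, the ranking recovers both noisy samples in the first corpus but only three of eight in the second, even though every noisy sample still has higher perplexity than every clean one. A score that reflects absolute cleanliness would not depend on how many other samples in the corpus are flawed.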
