Contrastive Error Attribution for Finetuned Language Models

Faisal Ladhak; Esin Durmus; Tatsunori Hashimoto

arXiv:2212.10722 · cs.CL · July 12, 2023

TL;DR

This paper introduces a contrastive error attribution method to identify and remove low-quality training data, significantly reducing hallucinations and errors in language models' outputs.

Contribution

It proposes a novel contrast-based error tracing technique that outperforms existing methods in detecting data errors affecting model faithfulness.

Findings

01. Achieves a mean average precision of 0.93 at detecting known data errors on synthetic tasks

02. Reduces entity hallucinations by 70% on the NYT dataset

03. Reduces semantic errors by 55% on the E2E dataset

Abstract

Recent work has identified noisy and misannotated data as a core cause of hallucinations and unfaithful outputs in Natural Language Generation (NLG) tasks. Consequently, identifying and removing these examples is a key open challenge in creating reliable NLG systems. In this work, we introduce a framework to identify and remove low-quality training instances that lead to undesirable outputs, such as faithfulness errors in text summarization. We show that existing approaches for error tracing, such as gradient-based influence measures, do not perform reliably for detecting faithfulness errors in NLG datasets. We overcome the drawbacks of existing error tracing methods through a new, contrast-based estimate that compares undesired generations to human-corrected outputs. Our proposed method can achieve a mean average precision of 0.93 at detecting known data errors across synthetic tasks…
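
To make the idea of a contrast-based estimate concrete, here is a minimal sketch on a toy logistic-regression model. It is an illustration only and not the paper's estimator: it assumes a simple gradient dot-product score, and every name in it (`grad_loss`, `contrastive_scores`, `x_err`, `y_bad`, `y_fixed`) is hypothetical. Each training example is scored by how well its loss gradient aligns with the difference between the gradient on the undesired output and the gradient on the human-corrected output, so examples that push the model toward the error and away from the correction rank highest.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def grad_loss(w, x, y):
    """Gradient of the logistic loss for one (x, y) pair with respect to w."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
    return [(p - y) * xi for xi in x]

def contrastive_scores(w, train, x_err, y_bad, y_fixed):
    """Score each training example against an (undesired, corrected) output pair.

    Assumption (not taken from the paper): the score is the dot product of the
    training gradient with grad(undesired) - grad(corrected). A gradient step on
    a high-scoring example lowers the loss of the undesired output more than it
    lowers the loss of the corrected one.
    """
    g_bad = grad_loss(w, x_err, y_bad)
    g_fix = grad_loss(w, x_err, y_fixed)
    contrast = [b - f for b, f in zip(g_bad, g_fix)]
    return [
        sum(gi * ci for gi, ci in zip(grad_loss(w, x, y), contrast))
        for x, y in train
    ]

# Toy data: example 0 carries the "bad" label on the same input as the error,
# example 1 carries the corrected label, example 2 is unrelated to the error.
w = [0.0, 0.0]
train = [([1.0, 0.0], 1), ([1.0, 0.0], 0), ([0.0, 1.0], 1)]
scores = contrastive_scores(w, train, x_err=[1.0, 0.0], y_bad=1, y_fixed=0)

# Rank training examples from most to least suspicious.
ranked = sorted(range(len(train)), key=lambda i: scores[i], reverse=True)
print(ranked, scores)
```

In this toy setup the example that shares the erroneous label ranks first, the unrelated example sits in the middle, and the example carrying the corrected label ranks last. A real NLG setting would replace the logistic loss with the sequence loss of the finetuned model.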

Code & Models

Repositories

fladhak/contrastive_error_attribution (Official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification