SCOPE: A Self-supervised Framework for Improving Faithfulness in Conditional Text Generation

Song Duong; Florian Le Bronnec; Alexandre Allauzen; Vincent Guigue; Alberto Lumbreras; Laure Soulier; Patrick Gallinari

arXiv:2502.13674·cs.CL·February 20, 2025

TL;DR

This paper introduces SCOPE, a self-supervised framework that enhances faithfulness in conditional text generation by training models to prefer grounded outputs, significantly reducing hallucinations in tasks like summarization.

Contribution

SCOPE presents a novel self-supervised training method that generates unfaithful samples to improve faithfulness in conditional text generation models.

Findings

01. Outperforms existing self-supervised methods on faithfulness metrics

02. Reduces hallucinations in summarization and data-to-text tasks

03. Improves human and automatic evaluation scores for groundedness

Abstract

Large Language Models (LLMs), when used for conditional text generation, often produce hallucinations, i.e., information that is unfaithful or not grounded in the input context. This issue arises in typical conditional text generation tasks, such as text summarization and data-to-text generation, where the goal is to produce fluent text based on contextual input. When fine-tuned on specific domains, LLMs struggle to provide faithful answers to a given context, often adding information or generating errors. One underlying cause of this issue is that LLMs rely on statistical patterns learned from their training data. This reliance can interfere with the model's ability to stay faithful to a provided context, leading to the generation of ungrounded information. We build upon this observation and introduce a novel self-supervised method for generating a training set of unfaithful samples.…
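
The idea of training a model to prefer grounded outputs over unfaithful ones can be illustrated with a generic pairwise preference loss. The sketch below is an assumption for illustration only: the abstract above is truncated and does not state which objective SCOPE uses or how its unfaithful samples are produced. The function names, the `beta` scale, and the example log-probabilities are all hypothetical.

```python
import math


def preference_loss(logp_grounded: float, logp_unfaithful: float, beta: float = 1.0) -> float:
    """Generic pairwise logistic loss (illustrative, not taken from the paper).

    Computes -log(sigmoid(beta * (logp_grounded - logp_unfaithful))), which is
    small when the model assigns a higher log-probability to the grounded
    output than to the unfaithful one for the same input context.
    """
    margin = beta * (logp_grounded - logp_unfaithful)
    # Numerically stable form of log(1 + exp(-margin)).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def mean_preference_loss(pairs: list[tuple[float, float]], beta: float = 1.0) -> float:
    """Average the loss over (logp_grounded, logp_unfaithful) pairs."""
    return sum(preference_loss(g, u, beta) for g, u in pairs) / len(pairs)


# Hypothetical sequence log-probabilities for two contexts.
example_pairs = [(-12.0, -15.0), (-20.0, -18.5)]
print(mean_preference_loss(example_pairs))
```

In this sketch the first pair already ranks the grounded output higher and contributes a small loss, while the second pair ranks the unfaithful output higher and contributes a larger one, so minimizing the average pushes the model toward the grounded outputs.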

Taxonomy

Topics: Media, Religion, Digital Communication

Methods: Sparse Evolutionary Training