Source-Aware Training Enables Knowledge Attribution in Language Models

Muhammad Khalifa; David Wadden; Emma Strubell; Honglak Lee; Lu Wang; Iz Beltagy; Hao Peng

arXiv:2404.01019·cs.CL·August 14, 2024·1 cites

TL;DR

This paper introduces source-aware training for large language models, enabling them to cite their pretraining sources, which improves transparency and interpretability without significantly affecting performance.

Contribution

It proposes a novel training method that associates source identifiers with knowledge in LLMs, allowing for faithful source attribution during response generation.

Findings

01. Enables LLMs to cite pretraining sources accurately

02. Minimal impact on model perplexity compared to standard training

03. Highlights importance of data augmentation for attribution
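
For reference, the perplexity mentioned in finding 02 is conventionally defined as the exponentiated average negative log-likelihood of a token sequence. The notation here (a sequence of N tokens x_1, ..., x_N and a model distribution p_theta) is the standard textbook form and is an assumption, not something taken from the paper:

```latex
\mathrm{PPL}(x_{1:N}) = \exp\left( -\frac{1}{N} \sum_{i=1}^{N} \log p_\theta\left(x_i \mid x_{<i}\right) \right)
```

Lower values mean the model assigns higher probability to the held-out text, so "minimal impact" means this quantity stays close to that of a model trained without source identifiers.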

Abstract

Large language models (LLMs) learn a vast amount of knowledge during pretraining, but they are often oblivious to the source(s) of such knowledge. We investigate the problem of intrinsic source citation, where LLMs are required to cite the pretraining source supporting a generated response. Intrinsic source citation can enhance LLM transparency, interpretability, and verifiability. To give LLMs such ability, we explore source-aware training -- a recipe that involves (i) training the LLM to associate unique source document identifiers with the knowledge in each document, followed by (ii) an instruction-tuning stage to teach the LLM to cite a supporting pretraining source when prompted. Source-aware training borrows from existing pretraining/fine-tuning frameworks and requires minimal changes to the model architecture or implementation. Through experiments on synthetic data, we…
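
The two-stage recipe in the abstract can be illustrated with a minimal data-preparation sketch. Everything specific here is an assumption made for illustration: the `Document` class, the `<id>...</id>` marker format, the placement of the identifier after the document text, the prompt wording, and the example document are all hypothetical and are not taken from the paper or its repository.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical markers wrapping a source identifier; the real format may differ.
ID_OPEN = "<id>"
ID_CLOSE = "</id>"


@dataclass(frozen=True)
class Document:
    doc_id: str  # unique source document identifier (format assumed)
    text: str    # the document's content


def make_pretraining_example(doc: Document) -> str:
    """Stage (i): pair a document's text with its identifier so that a
    language model trained on the string can associate the two."""
    return f"{doc.text} {ID_OPEN}{doc.doc_id}{ID_CLOSE}"


def make_citation_example(question: str, answer: str, doc: Document) -> dict:
    """Stage (ii): build an instruction-tuning pair whose target contains
    the answer followed by the identifier of the supporting document."""
    prompt = f"Question: {question}\nAnswer and cite the supporting source."
    target = f"{answer} {ID_OPEN}{doc.doc_id}{ID_CLOSE}"
    return {"prompt": prompt, "target": target}


def extract_citation(generated: str) -> Optional[str]:
    """Pull the last cited identifier out of generated text, or return
    None if no complete marker pair is present."""
    start = generated.rfind(ID_OPEN)
    if start == -1:
        return None
    start += len(ID_OPEN)
    end = generated.find(ID_CLOSE, start)
    if end == -1:
        return None
    return generated[start:end]


if __name__ == "__main__":
    # Made-up document used only to exercise the helpers.
    doc = Document(doc_id="doc-0042", text="Jane Doe was born in Springfield.")
    print(make_pretraining_example(doc))
    example = make_citation_example("Where was Jane Doe born?", "Springfield.", doc)
    print(example["prompt"])
    print(example["target"])
    print(extract_citation(example["target"]))
```

The sketch covers only how training strings might be assembled; the training loop, tokenization, and any data augmentation are out of scope. An evaluation could then compare `extract_citation` output against the identifier of the document that actually supports the answer.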

Code & Models

Repositories

mukhal/intrinsic-source-citation
jax · Official

Datasets

mkhalifa/BioCite
dataset · 20 dl

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques