SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models

Yung-Sung Chuang; Benjamin Cohen-Wang; Shannon Zejiang Shen; Zhaofeng Wu; Hu Xu; Xi Victoria Lin; James Glass; Shang-Wen Li; Wen-tau Yih

arXiv:2502.09604 · cs.CL · June 17, 2025

TL;DR

SelfCite is a self-supervised method that aligns large language models to generate accurate, fine-grained, sentence-level citations. It uses a context-ablation reward computed by the model itself, and it improves citation quality on long-form QA tasks.

Contribution

The paper introduces SelfCite, a self-supervised alignment technique that uses context ablation as a reward signal to improve citation accuracy in LLMs, reducing reliance on manual annotations.

Findings

01 Increases citation F1 by up to 5.3 points on the LongBench-Cite benchmark

02 Improves citation quality in long-form question answering

03 The same reward guides inference-time best-of-N sampling and preference-optimization fine-tuning for better citations

Abstract

We introduce SelfCite, a novel self-supervised approach that aligns LLMs to generate high-quality, fine-grained, sentence-level citations for the statements in their generated responses. Instead of only relying on costly and labor-intensive annotations, SelfCite leverages a reward signal provided by the LLM itself through context ablation: If a citation is necessary, removing the cited text from the context should prevent the same response; if sufficient, retaining the cited text alone should preserve the same response. This reward can guide the inference-time best-of-N sampling strategy to improve citation quality significantly, as well as be used in preference optimization to directly fine-tune the models for generating better citations. The effectiveness of SelfCite is demonstrated by increasing citation F1 up to 5.3 points on the LongBench-Cite benchmark across five long-form…
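The context-ablation idea in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the way the necessity and sufficiency terms are summed into one scalar, and the word-overlap `toy_log_prob` scorer are all assumptions made for the example. In practice the scorer would be the LLM's own log-probability of the response given a context.

```python
from typing import Callable, List, Sequence

# Hypothetical scorer type: log-probability of `response` given context sentences.
LogProbFn = Callable[[Sequence[str], str], float]


def ablation_reward(log_prob: LogProbFn, context: Sequence[str],
                    cited: Sequence[int], response: str) -> float:
    """Score a citation (indices into `context`) by context ablation."""
    cited_set = set(cited)
    kept = [context[i] for i in sorted(cited_set)]
    removed = [s for i, s in enumerate(context) if i not in cited_set]

    full = log_prob(context, response)
    # Necessity: removing the cited text should make the response less likely.
    necessity = full - log_prob(removed, response)
    # Sufficiency: the cited text alone should keep the response likely.
    sufficiency = log_prob(kept, response) - full
    # Assumption: the two terms are simply added into one scalar reward.
    return necessity + sufficiency


def best_of_n(log_prob: LogProbFn, context: Sequence[str],
              candidates: Sequence[List[int]], response: str) -> List[int]:
    """Pick the candidate citation with the highest ablation reward."""
    return max(candidates,
               key=lambda c: ablation_reward(log_prob, context, c, response))


def toy_log_prob(context: Sequence[str], response: str) -> float:
    """Toy stand-in for an LLM: penalize response words absent from the context."""
    words = set(" ".join(context).lower().split())
    return -float(sum(1 for w in response.lower().split() if w not in words))


context = ["the sky is blue", "grass is green", "paris is in france"]
response = "grass is green"
candidates = [[0], [1], [2], [0, 2]]
print(best_of_n(toy_log_prob, context, candidates, response))
```

With a real model, `toy_log_prob` would be replaced by a function that scores the response under each ablated context. The same reward could then rank sampled citations at inference time or supply preference pairs for fine-tuning, as the abstract describes.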

Code & Models

Repositories

voidism/selfcite (PyTorch, official)

Videos

SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models · SlidesLive

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Recommender Systems and Techniques