SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models
Yung-Sung Chuang, Benjamin Cohen-Wang, Shannon Zejiang Shen, Zhaofeng Wu, Hu Xu, Xi Victoria Lin, James Glass, Shang-Wen Li, Wen-tau Yih

TL;DR
SelfCite is a self-supervised method that aligns large language models to generate accurate, fine-grained, sentence-level citations. It uses a reward signal derived from context ablation, and it improves citation quality on long-form question answering.
Contribution
The paper introduces SelfCite, a self-supervised alignment technique that uses context ablation as a reward signal to improve citation accuracy in LLMs, reducing reliance on manual annotations.
Findings
Increases citation F1 by up to 5.3 points on LongBench-Cite
Improves citation quality in long-form question answering
The same reward supports both inference-time best-of-N sampling and preference-optimization fine-tuning for citations
Abstract
We introduce SelfCite, a novel self-supervised approach that aligns LLMs to generate high-quality, fine-grained, sentence-level citations for the statements in their generated responses. Instead of only relying on costly and labor-intensive annotations, SelfCite leverages a reward signal provided by the LLM itself through context ablation: If a citation is necessary, removing the cited text from the context should prevent the same response; if sufficient, retaining the cited text alone should preserve the same response. This reward can guide the inference-time best-of-N sampling strategy to improve citation quality significantly, as well as be used in preference optimization to directly fine-tune the models for generating better citations. The effectiveness of SelfCite is demonstrated by increasing citation F1 up to 5.3 points on the LongBench-Cite benchmark across five long-form…
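The necessity and sufficiency conditions described in the abstract can be sketched as a small scoring routine. This is a minimal illustration under stated assumptions, not the authors' implementation: the scorer `log_prob` (which would return the log-probability of a response given a list of context sentences), the function names `ablation_reward` and `best_of_n`, and the choice to combine the two conditions by summing log-probability differences are all hypothetical. The toy scorer stands in for a real LLM so the example runs on its own.

```python
from typing import Callable, List, Sequence

# Hypothetical scorer type: log p(response | context sentences) under the LLM.
LogProbFn = Callable[[Sequence[str], str], float]


def ablation_reward(
    log_prob: LogProbFn,
    context: Sequence[str],
    cited: Sequence[int],
    response: str,
) -> float:
    """Score one candidate citation (a set of sentence indices) by context ablation.

    Assumption: necessity and sufficiency are each measured as a log-probability
    difference and then summed. The abstract states the two conditions but not
    this exact formula.
    """
    cited_set = set(cited)
    cited_only = [s for i, s in enumerate(context) if i in cited_set]
    ablated = [s for i, s in enumerate(context) if i not in cited_set]

    full = log_prob(context, response)
    # Necessity: removing the cited text should make the response less likely.
    necessity = full - log_prob(ablated, response)
    # Sufficiency: keeping only the cited text should keep the response likely.
    sufficiency = log_prob(cited_only, response) - full
    return necessity + sufficiency


def best_of_n(
    log_prob: LogProbFn,
    context: Sequence[str],
    candidates: Sequence[Sequence[int]],
    response: str,
) -> List[int]:
    """Return the sampled citation candidate with the highest ablation reward."""
    best = max(
        candidates,
        key=lambda c: ablation_reward(log_prob, context, c, response),
    )
    return list(best)


def toy_log_prob(context: Sequence[str], response: str) -> float:
    """Toy stand-in for an LLM: the response is likely only if the context
    still contains the sentence mentioning "Eiffel"."""
    return 0.0 if any("Eiffel" in s for s in context) else -5.0


context = [
    "The Eiffel Tower is in Paris.",
    "Bananas are yellow.",
    "The Louvre is a museum.",
]
response = "The Eiffel Tower is located in Paris."
candidates = [[0], [1], [1, 2]]

chosen = best_of_n(toy_log_prob, context, candidates, response)
print(chosen)
```

With the toy scorer, citing sentence 0 is both necessary and sufficient, so it receives the highest reward among the three candidates. In practice the same ranking could also supply preferred and rejected pairs for preference optimization, which is the second use the abstract mentions.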
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Recommender Systems and Techniques
