Cited Text Spans for Citation Text Generation

Xiangci Li; Yi-Hui Lee; Jessica Ouyang

arXiv:2309.06365 · cs.CL · May 19, 2025

TL;DR

This paper introduces a citation text generation method that conditions on cited text spans instead of abstracts, using distant labeling and keyword retrieval to improve factual grounding and practicality.

Contribution

The paper proposes conditioning citation text generation on cited text spans (CTS) rather than on cited paper abstracts, and introduces methods for automatic CTS annotation (distant labeling) and CTS retrieval (keyword-based).

Findings

01 · Conditioning on cited text spans (CTS) instead of abstracts improves factual accuracy.

02 · Distant labeling of candidate CTS sentences performs well enough to substitute for expensive human annotation in model training.

03 · Keyword-based retrieval makes full-text grounded citation generation practical.

Abstract

An automatic citation generation system aims to concisely and accurately describe the relationship between two scientific articles. To do so, such a system must ground its outputs to the content of the cited paper to avoid non-factual hallucinations. Due to the length of scientific documents, existing abstractive approaches have conditioned only on cited paper abstracts. We demonstrate empirically that the abstract is not always the most appropriate input for citation generation and that models trained in this way learn to hallucinate. We propose to condition instead on the cited text span (CTS) as an alternative to the abstract. Because manual CTS annotation is extremely time- and labor-intensive, we experiment with distant labeling of candidate CTS sentences, achieving sufficiently strong performance to substitute for expensive human annotations in model training, and we propose a…
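
As a rough illustration of how keyword-based retrieval over a cited paper's full text might surface candidate CTS sentences, the following minimal Python sketch ranks sentences by how often they contain the query keywords. The overlap scoring, the stopword list, and the names `tokenize` and `retrieve_candidate_cts` are assumptions made for illustration; they are not taken from the paper or its repository.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not from the paper).
STOPWORDS = frozenset({
    "a", "an", "the", "of", "to", "in", "on", "for", "and", "or",
    "is", "are", "we", "that", "this", "with", "as", "by",
})


def tokenize(text):
    """Lowercase the text and keep alphanumeric tokens that are not stopwords."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]


def retrieve_candidate_cts(keywords, sentences, top_k=2):
    """Return up to top_k sentences ranked by keyword occurrence count.

    Hypothetical helper: scores each sentence by the number of times any
    query keyword token appears in it, and drops sentences with no match.
    Ties are broken by original sentence order.
    """
    query = set(tokenize(" ".join(keywords)))
    scored = []
    for idx, sentence in enumerate(sentences):
        counts = Counter(tokenize(sentence))
        score = sum(counts[token] for token in query)
        if score > 0:
            scored.append((score, idx, sentence))
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [sentence for _, _, sentence in scored[:top_k]]


# Toy sentences standing in for a cited paper's full text.
cited_paper_sentences = [
    "We train a transformer model on citation data.",
    "The weather was pleasant during the conference.",
    "Our transformer model reduces hallucination in generated citation text.",
]
candidates = retrieve_candidate_cts(
    ["transformer", "hallucination", "citation"], cited_paper_sentences
)
```

In this sketch the retrieved sentences would then replace the abstract as the generator's input; a real system would likely use a stronger scoring function than raw keyword counts.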

Code & Models

Repositories

jacklxc/cts4citationtextgeneration
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Semantic Web and Ontologies · Biomedical Text Mining and Ontologies