On the Capacity of Citation Generation by Large Language Models

Haosheng Qian; Yixing Fan; Ruqing Zhang; Jiafeng Guo

arXiv:2410.11217 · cs.CL · October 16, 2024

TL;DR

This paper systematically analyzes large language models' ability to generate accurate citations, introduces new evaluation metrics, and proposes a Generate-then-Refine method to improve citation quality in responses.

Contribution

It provides a comprehensive evaluation of LLMs' citation generation, introduces novel metrics, and proposes a method to enhance citation accuracy without changing response content.

Findings

01. The proposed method significantly improves citation quality in LLM-generated responses.

02. New citation evaluation metrics better assess citation relevance and correctness.

03. Large language models can be effectively guided to generate more accurate citations.

Abstract

Retrieval-augmented generation (RAG) has emerged as a promising method to alleviate the "hallucination" problem in large language models (LLMs), since it can incorporate external, traceable resources into response generation. The essence of RAG in combating hallucination lies in accurately attributing claims in responses to the corresponding retrieved documents. However, most existing work focuses on improving the quality of the responses LLMs generate, while largely overlooking their ability to attribute sources accurately. In this study, we conduct a systematic analysis of the capabilities of LLMs in generating citations during response generation, and further introduce a novel method to enhance their citation generation abilities. Specifically, we evaluate both the correctness and the citation quality of seven widely used LLMs on two benchmark datasets. Meanwhile, we…
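
To make the idea of attributing claims to retrieved documents concrete, the following is a minimal illustrative sketch, not the paper's evaluation protocol or its proposed metrics. It assumes that responses mark citations with bracketed indices such as `[1]`, and that a hypothetical gold annotation lists, for each sentence, the retrieved documents that actually support it. The function names `extract_citations` and `citation_precision_recall` are invented for this example, and the simple set-overlap scoring is an assumption chosen for brevity.

```python
import re

# Assumed citation format: bracketed document indices such as [1] or [2][3].
CITATION = re.compile(r"\[(\d+)\]")


def extract_citations(response):
    """Return one set of cited document ids per sentence of the response."""
    # Naive sentence split on whitespace that follows ., ! or ?
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", response.strip()) if s]
    return [{int(n) for n in CITATION.findall(s)} for s in sentences]


def citation_precision_recall(cited, gold):
    """Set-overlap precision and recall over per-sentence citation sets.

    cited: citations the model produced, one set per sentence.
    gold:  hypothetical annotation of supporting documents, one set per sentence.
    """
    true_positives = sum(len(c & g) for c, g in zip(cited, gold))
    n_cited = sum(len(c) for c in cited)
    n_gold = sum(len(g) for g in gold)
    precision = true_positives / n_cited if n_cited else 0.0
    recall = true_positives / n_gold if n_gold else 0.0
    return precision, recall


response = (
    "Paris is the capital of France [1]. "
    "It hosts the Louvre [2][3]. "
    "It is large."
)
cited = extract_citations(response)
gold = [{1}, {2}, {4}]  # hypothetical gold labels for the three sentences
precision, recall = citation_precision_recall(cited, gold)
```

In this toy case the second sentence cites one document too many and the third sentence cites nothing, so both precision and recall fall below 1. A realistic evaluation would need a way to judge whether a cited document actually supports a claim, which this sketch replaces with hand-written gold labels.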

Taxonomy

Topics: Topic Modeling · Semantic Web and Ontologies

Methods: Attention Is All You Need · Linear Layer · Multi-Head Attention · Dense Connections · WordPiece · Residual Connection · Linear Warmup With Linear Decay · Dropout · Layer Normalization