Training Language Models to Generate Text with Citations via Fine-grained Rewards
Chengyu Huang, Zeqiu Wu, Yushi Hu, Wenya Wang

TL;DR
This paper introduces a training framework that uses fine-grained rewards to teach LLMs to generate accurate, citation-supported responses, reducing hallucination and improving credibility, especially for smaller models.
Contribution
The paper presents a reward-based training method that improves both citation quality and response correctness in LLMs, and shows that applying fine-grained rewards to common LLM training strategies works better than conventional practice.
Findings
Fine-grained rewards enhance citation relevance and support in LLMs.
The method outperforms baseline models, including GPT-3.5-turbo.
Performance improves on the ALCE and EXPERTQA datasets.
Abstract
While recent Large Language Models (LLMs) have proven useful in answering user queries, they are prone to hallucination, and their responses often lack credibility due to missing references to reliable sources. An intuitive solution to these issues would be to include in-text citations referring to external documents as evidence. While previous works have directly prompted LLMs to generate in-text citations, their performances are far from satisfactory, especially when it comes to smaller LLMs. In this work, we propose an effective training framework using fine-grained rewards to teach LLMs to generate highly supportive and relevant citations, while ensuring the correctness of their responses. We also conduct a systematic analysis of applying these fine-grained rewards to common LLM training strategies, demonstrating its advantage over conventional practices. We conduct extensive…
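The abstract does not spell out how the fine-grained rewards are computed, so the following is only an illustrative sketch of what "fine-grained" could mean here: each sentence of a response gets its own citation scores instead of the whole response getting a single scalar. Everything in it is an assumption made for illustration, including the `[n]` citation format, the names `supports` and `fine_grained_rewards`, the 0.7 threshold, and the use of token overlap as a crude stand-in for a real support (entailment) check. It is not the authors' implementation.

```python
import re
from dataclasses import dataclass


@dataclass
class SentenceReward:
    sentence: str
    recall: float     # 1.0 if the cited passages jointly support the sentence
    precision: float  # fraction of cited passages that support it on their own


def supports(passage: str, claim: str, threshold: float = 0.7) -> bool:
    """Hypothetical support check: share of claim tokens found in the passage.

    A real system would likely use an entailment model; token overlap is
    only a placeholder so that the sketch runs without dependencies.
    """
    claim_tokens = set(re.findall(r"[a-z0-9]+", claim.lower()))
    if not claim_tokens:
        return False
    passage_tokens = set(re.findall(r"[a-z0-9]+", passage.lower()))
    return len(claim_tokens & passage_tokens) / len(claim_tokens) >= threshold


def fine_grained_rewards(response: str, passages: list[str]) -> list[SentenceReward]:
    """Score every sentence separately (assumes citations look like [1], [2])."""
    rewards = []
    for sentence in re.split(r"(?<=[.!?])\s+", response.strip()):
        if not sentence:
            continue
        # Map 1-based citation markers to passage indices, dropping invalid ones.
        cited = [int(n) - 1 for n in re.findall(r"\[(\d+)\]", sentence)]
        cited = [i for i in cited if 0 <= i < len(passages)]
        claim = re.sub(r"\[\d+\]", "", sentence)
        if not cited:
            # An uncited sentence earns no citation reward in this sketch.
            rewards.append(SentenceReward(sentence, 0.0, 0.0))
            continue
        joint = " ".join(passages[i] for i in cited)
        recall = 1.0 if supports(joint, claim) else 0.0
        precision = sum(supports(passages[i], claim) for i in cited) / len(cited)
        rewards.append(SentenceReward(sentence, recall, precision))
    return rewards


passages = [
    "Paris is the capital and largest city of France.",
    "The Moon is Earth's only natural satellite.",
]
response = "Paris is the capital of France [1]. The moon is made of cheese [2]."
for r in fine_grained_rewards(response, passages):
    print(r.recall, r.precision, r.sentence)
```

Per-sentence scores like these localize the training signal: a response with one well-supported sentence and one unsupported sentence is not rewarded or penalized as a whole, which is the general motivation for fine-grained over holistic rewards.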
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques
Methods: Attention Is All You Need · Cosine Annealing · Linear Layer · Byte Pair Encoding · Multi-Head Attention · Attention Dropout · Residual Connection
