FineRef: Fine-Grained Error Reflection and Correction for Long-Form Generation with Citations
Yixing Peng, Licheng Zhang, Shancheng Fang, Yi Liu, Peijian Gu, Quan Wang

TL;DR
FineRef introduces a novel two-stage training framework enabling LLMs to self-identify and correct citation errors, significantly improving citation accuracy and answer quality in long-form generation with citations.
Contribution
The paper presents FineRef, a new error reflection and correction framework with a two-stage training strategy, enhancing citation fidelity and robustness in LLM-generated long-form answers.
Findings
FineRef outperforms GPT-4 by up to 18% in Citation F1.
It surpasses state-of-the-art models in answer accuracy.
It generalizes well to noisy-retrieval and domain-transfer scenarios.
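This summary does not define Citation F1. A minimal sketch, assuming the usual reading of the metric as the harmonic mean of citation precision and citation recall (the function name and the [0, 1] score range are assumptions, not taken from the paper):

```python
def citation_f1(precision: float, recall: float) -> float:
    """Harmonic mean of citation precision and recall, both assumed in [0, 1]."""
    if precision + recall == 0:
        # Avoid division by zero when no citation is correct or recalled.
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Because the harmonic mean is dominated by the lower of the two scores, a model cannot reach a high Citation F1 by citing very sparingly (high precision, low recall) or very liberally (the reverse).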
Abstract
Generating with citations is crucial for trustworthy Large Language Models (LLMs), yet even advanced LLMs often produce mismatched or irrelevant citations. Existing methods over-optimize citation fidelity while overlooking relevance to the user query, which degrades answer quality and robustness in real-world settings with noisy or irrelevant retrieved content. Moreover, the prevailing single-pass paradigm struggles to deliver optimal answers in long-form generation that requires multiple citations. To address these limitations, we propose FineRef, a framework based on Fine-grained error Reflection, which explicitly teaches the model to self-identify and correct two key citation errors, mismatch and irrelevance, on a per-citation basis. FineRef follows a two-stage training strategy. The first stage instills an "attempt-reflect-correct" behavioral pattern via supervised fine-tuning,…
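The per-citation "reflect" and "correct" steps can be illustrated with a small sketch. It is not the authors' implementation: in FineRef the model itself performs the reflection and correction, whereas here the two judges (`supports`, `relevant`) are hypothetical callables supplied by the caller, "irrelevance" is assumed to mean that a cited passage is unrelated to the query, and correction is simplified to dropping flagged citations.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Claim:
    text: str             # one sentence of the draft answer (the "attempt")
    citations: List[int]  # ids of the passages it cites


# Hypothetical judge signatures; in practice these would be model calls.
Supports = Callable[[str, str], bool]  # (passage, claim_text) -> bool
Relevant = Callable[[str, str], bool]  # (query, passage) -> bool


def reflect(query: str, claims: List[Claim], passages: Dict[int, str],
            supports: Supports, relevant: Relevant) -> List[Tuple[int, int, str]]:
    """Label every citation as 'ok', 'mismatch', or 'irrelevance'."""
    labels = []
    for i, claim in enumerate(claims):
        for pid in claim.citations:
            passage = passages[pid]
            if not relevant(query, passage):
                labels.append((i, pid, "irrelevance"))
            elif not supports(passage, claim.text):
                labels.append((i, pid, "mismatch"))
            else:
                labels.append((i, pid, "ok"))
    return labels


def correct(claims: List[Claim],
            labels: List[Tuple[int, int, str]]) -> List[Claim]:
    """Simplified correction: drop flagged citations, then drop uncited claims."""
    bad = {(i, pid) for i, pid, label in labels if label != "ok"}
    fixed = []
    for i, claim in enumerate(claims):
        kept = [pid for pid in claim.citations if (i, pid) not in bad]
        if kept:
            fixed.append(Claim(claim.text, kept))
    return fixed
```

Judging each citation separately, rather than scoring the whole answer once, is what makes the feedback fine-grained: a single bad citation can be removed without discarding an otherwise well-supported sentence.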
Taxonomy
Topics: Topic Modeling · Machine Learning in Materials Science · Expert finding and Q&A systems
