FineRef: Fine-Grained Error Reflection and Correction for Long-Form Generation with Citations

Yixing Peng; Licheng Zhang; Shancheng Fang; Yi Liu; Peijian Gu; Quan Wang

arXiv:2602.18437 · cs.IR · February 24, 2026

Open Access

TL;DR

FineRef introduces a novel two-stage training framework enabling LLMs to self-identify and correct citation errors, significantly improving citation accuracy and answer quality in long-form generation with citations.

Contribution

The paper presents FineRef, a new error reflection and correction framework with a two-stage training strategy, enhancing citation fidelity and robustness in LLM-generated long-form answers.

Findings

01 FineRef outperforms GPT-4 by up to 18% in Citation F1.

02 It surpasses state-of-the-art models in answer accuracy.

03 It demonstrates strong generalization under noisy retrieval and in domain-transfer scenarios.
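
For readers unfamiliar with the metric named in the first finding: an F1 score is conventionally the harmonic mean of precision and recall. The sketch below assumes Citation F1 follows that convention. How the paper scores citation precision and recall is not stated on this page, and the input values are made up purely for illustration.

```python
def citation_f1(precision: float, recall: float) -> float:
    """Harmonic mean of citation precision and citation recall (each in [0, 1]).

    Assumption: Citation F1 uses the standard F1 definition; the inputs
    are whatever per-answer precision/recall the evaluation produces.
    """
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both scores are zero
    return 2 * precision * recall / (precision + recall)


# Illustrative values only, not results from the paper.
example_score = citation_f1(0.80, 0.60)
print(round(example_score, 4))
```

Because the harmonic mean is pulled toward the lower of the two inputs, a system cannot score well by citing everything (high recall, low precision) or by citing almost nothing (high precision, low recall).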

Abstract

Generating with citations is crucial for trustworthy Large Language Models (LLMs), yet even advanced LLMs often produce mismatched or irrelevant citations. Existing methods over-optimize citation fidelity while overlooking relevance to the user query, which degrades answer quality and robustness in real-world settings with noisy or irrelevant retrieved content. Moreover, the prevailing single-pass paradigm struggles to deliver optimal answers in long-form generation that requires multiple citations. To address these limitations, we propose FineRef, a framework based on Fine-grained error Reflection, which explicitly teaches the model to self-identify and correct two key citation errors, mismatch and irrelevance, on a per-citation basis. FineRef follows a two-stage training strategy. The first stage instills an "attempt-reflect-correct" behavioral pattern via supervised fine-tuning,…
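
The per-citation "attempt-reflect-correct" pattern described in the abstract can be pictured with a small sketch. This is not the authors' implementation: in FineRef the model itself is trained to reflect and correct, whereas here two hypothetical judge callables (`supports`, `is_relevant`) stand in for that behavior, the toy keyword judges exist only so the example runs, and "correction" is assumed to mean simply dropping a flagged citation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Statement:
    text: str
    citations: List[int]  # indices into the retrieved passages


Judge = Callable[[str, str], bool]


def reflect(stmt: Statement, passages: List[str], query: str,
            supports: Judge, is_relevant: Judge) -> Dict[int, str]:
    """Label each citation as 'ok', 'mismatch', or 'irrelevant'."""
    labels: Dict[int, str] = {}
    for idx in stmt.citations:
        passage = passages[idx]
        if not is_relevant(query, passage):
            labels[idx] = "irrelevant"  # passage does not address the query
        elif not supports(stmt.text, passage):
            labels[idx] = "mismatch"    # passage does not back the statement
        else:
            labels[idx] = "ok"
    return labels


def correct(stmt: Statement, labels: Dict[int, str]) -> Statement:
    """Assumed correction step: keep only citations labeled 'ok'."""
    kept = [i for i in stmt.citations if labels[i] == "ok"]
    return Statement(stmt.text, kept)


# Toy keyword-overlap judges (hypothetical stand-ins, not from the paper).
def _words(s: str) -> set:
    return set(s.lower().replace(".", "").split())


def toy_supports(claim: str, passage: str) -> bool:
    return _words(claim) <= _words(passage)


def toy_is_relevant(query: str, passage: str) -> bool:
    return bool(_words(query) & _words(passage))


passages = [
    "Paris is the capital of France.",
    "Berlin is the capital of Germany.",
    "Bananas are rich in potassium.",
]
query = "capital of France"
attempt = Statement("Paris is the capital of France", [0, 1, 2])  # attempt

labels = reflect(attempt, passages, query, toy_supports, toy_is_relevant)
corrected = correct(attempt, labels)
print(labels)
print(corrected.citations)
```

In this toy run the first citation is kept, the second is flagged as a mismatch (relevant to the query but not supporting the statement), and the third is flagged as irrelevant, which mirrors the two error types the abstract names.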

Taxonomy

Topics: Topic Modeling · Machine Learning in Materials Science · Expert Finding and Q&A Systems