GenProve: Learning to Generate Text with Fine-Grained Provenance

Jingxuan Wei; Xingyue Wang; Yanghaoyu Liao; Jie Dong; Yuchen Liu; Caijun Jia; Bihui Yu; Junnan Zhu

arXiv:2601.04932·cs.CL·April 14, 2026



TL;DR

GenProve introduces a new framework and dataset for generating text with detailed, sentence-level provenance to improve accountability and distinguish between quoting and reasoning in language models.

Contribution

The paper presents the ReFInE dataset and the GenProve framework, which together enable models to produce structured provenance alongside their answers, improving transparency and reasoning.

Findings

01. GenProve outperforms 14 strong LLMs in joint answer and provenance accuracy.

02. Models excel at quoting but struggle with inference-based provenance.

03. The approach improves accountability by distinguishing between different types of evidence.

Abstract

Large language models (LLMs) often hallucinate, and while adding citations is a common solution, it is frequently insufficient for accountability because users struggle to verify how a cited source supports a generated claim. Existing methods are typically coarse-grained and fail to distinguish between direct quotes and complex reasoning. In this paper, we introduce Generation-time Fine-grained Provenance, a task where models must generate fluent answers while simultaneously producing structured, sentence-level provenance triples. To enable this, we present ReFInE (Relation-aware Fine-grained Interpretability & Evidence), a dataset featuring expert-verified annotations that distinguish between Quotation, Compression, and Inference. Building on ReFInE, we propose GenProve, a framework that combines Supervised Fine-Tuning (SFT) with Group Relative Policy Optimization (GRPO). By optimizing a…
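To make the idea of sentence-level provenance triples concrete, here is a minimal Python sketch. The abstract names the three relation types but does not specify the triple layout in this excerpt, so the `ProvenanceTriple` fields, the index-based addressing of sentences, and both helper functions are assumptions for illustration only, not the ReFInE schema.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class Relation(Enum):
    """The three relation types named in the abstract."""
    QUOTATION = "Quotation"
    COMPRESSION = "Compression"
    INFERENCE = "Inference"


@dataclass(frozen=True)
class ProvenanceTriple:
    # Hypothetical layout: field names and integer indices are assumptions,
    # not the format used by the ReFInE dataset.
    answer_sentence: int   # index of a sentence in the generated answer
    source_sentence: int   # index of the supporting sentence in the source
    relation: Relation     # how the source supports the answer sentence


def relations_per_sentence(triples):
    """Map each answer sentence index to the set of relation types backing it."""
    by_sentence = defaultdict(set)
    for t in triples:
        by_sentence[t.answer_sentence].add(t.relation)
    return dict(by_sentence)


def unsupported_sentences(num_answer_sentences, triples):
    """Return indices of answer sentences that carry no provenance triple."""
    covered = {t.answer_sentence for t in triples}
    return [i for i in range(num_answer_sentences) if i not in covered]


# Illustrative data: sentence 0 is quoted, sentence 1 is inferred from two
# source sentences, and sentence 2 has no provenance at all.
example = [
    ProvenanceTriple(0, 3, Relation.QUOTATION),
    ProvenanceTriple(1, 5, Relation.INFERENCE),
    ProvenanceTriple(1, 6, Relation.INFERENCE),
]
```

Under this assumed layout, a reader could call `unsupported_sentences(3, example)` to find answer sentences that lack any cited support, and `relations_per_sentence(example)` to see whether a given sentence rests on quotation or on inference.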


Code & Models

Datasets

liaoyanghaoyu/ReFInE
dataset
