DiVA: Fine-grained Factuality Verification with Agentic-Discriminative Verifier
Hui Huang, Muyun Yang, Yuki Arase

TL;DR
DiVA is a hybrid framework combining generative and discriminative models to enable fine-grained factuality verification, addressing the limitations of binary correctness judgments in evaluating large language models.
Contribution
We introduce DiVA, a novel hybrid agentic-discriminative framework, and create FGVeriBench, a benchmark for fine-grained factuality verification, advancing the evaluation of LLMs.
Findings
DiVA outperforms existing methods on FGVeriBench.
DiVA effectively handles both general and multi-hop questions.
The benchmark enables detailed assessment of factuality errors.
Abstract
Despite the significant advancements of Large Language Models (LLMs), their factuality remains a critical challenge, fueling growing interest in factuality verification. Existing research on factuality verification primarily conducts binary judgments (e.g., correct or incorrect), which fails to distinguish varying degrees of error severity. This limits its utility for applications such as fine-grained evaluation and preference optimization. To bridge this gap, we propose the Agentic Discriminative Verifier (DiVA), a hybrid framework that synergizes the agentic search capabilities of generative models with the precise scoring aptitude of discriminative models. We also construct a new benchmark, FGVeriBench, as a robust testbed for fine-grained factuality verification. Experimental results on FGVeriBench demonstrate that our DiVA significantly outperforms existing methods on factuality…
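The gap between binary and fine-grained verification described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the fact store `FACTS` and the functions `gather_evidence` (a stand-in for agentic search) and `score_claim` (a stand-in for a discriminative scorer) are hypothetical names, and exact string matching is used only as a placeholder for real model calls. The example assumes an answer has already been split into individual claims.

```python
# Toy fact store standing in for a real search backend (an assumption).
FACTS = {
    "paris is the capital of france",
    "the eiffel tower was completed in 1889",
}


def normalize(text: str) -> str:
    """Lowercase, drop a trailing period, and collapse whitespace."""
    return " ".join(text.lower().strip(" .").split())


def gather_evidence(claim: str) -> list[str]:
    """Hypothetical stand-in for agentic search: facts sharing a word with the claim."""
    words = set(normalize(claim).split())
    return sorted(f for f in FACTS if words & set(f.split()))


def score_claim(claim: str, evidence: list[str]) -> float:
    """Hypothetical stand-in for a discriminative scorer: 1.0 if supported, else 0.0."""
    return 1.0 if normalize(claim) in evidence else 0.0


def fine_grained_score(claims: list[str]) -> float:
    """Fraction of claims supported by the gathered evidence (0.0 to 1.0)."""
    if not claims:
        return 0.0
    return sum(score_claim(c, gather_evidence(c)) for c in claims) / len(claims)


def binary_judgment(claims: list[str]) -> bool:
    """Binary view: the answer is correct only if every claim is supported."""
    return fine_grained_score(claims) == 1.0


partly_wrong = [
    "Paris is the capital of France.",
    "The Eiffel Tower was completed in 1925.",
]
fully_wrong = [
    "Paris is the capital of Spain.",
    "The Eiffel Tower was completed in 1925.",
]

# Both answers fail the binary check, but the graded score separates them.
print(binary_judgment(partly_wrong), fine_grained_score(partly_wrong))
print(binary_judgment(fully_wrong), fine_grained_score(fully_wrong))
```

Under these assumptions, the binary judgment treats both answers as equally incorrect, while the graded score separates a partially wrong answer from a fully wrong one. That distinction is the kind of signal the abstract suggests is useful for fine-grained evaluation and preference optimization.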
Taxonomy
Topics: Topic Modeling · Expert finding and Q&A systems · Text and Document Classification Technologies
