VeriFastScore: Speeding up long-form factuality evaluation

Rishanth Rajendhran; Amir Zadeh; Matthew Sarte; Chuan Li; Mohit Iyyer

arXiv:2505.16973 · cs.CL · November 3, 2025

TL;DR

VeriFastScore is a fine-tuned Llama3.1 8B model that evaluates long-form factuality by extracting and verifying all claims in a response in a single pass, achieving a 6.6x speedup over VeriScore while remaining strongly correlated with it.

Contribution

We introduce VeriFastScore, a fine-tuned model that speeds up long-form factuality evaluation by extracting and verifying all claims in a response at once, replacing the many LLM calls that decompose-then-verify metrics require.

Findings

01 Achieves a 6.6x speedup over VeriScore

02 Correlates strongly with VeriScore (r=0.80 at the example level, r=0.94 at the system level)

03 Handles ~4K tokens of noisy evidence per response on average
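The example-level and system-level correlations in the findings are computed over different units, which is why the two numbers differ. The sketch below illustrates that distinction on toy data; it assumes Pearson's r and a simple per-system mean, and the function names and record layout are hypothetical rather than taken from the paper's code.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def example_and_system_correlation(records):
    """records: iterable of (system_name, reference_score, fast_score).

    Example level: correlate the two metrics over individual responses.
    System level: average each metric per system first, then correlate
    the per-system means (an assumed aggregation, for illustration).
    """
    records = list(records)
    example_r = pearson([r[1] for r in records], [r[2] for r in records])

    by_system = defaultdict(list)
    for name, ref, fast in records:
        by_system[name].append((ref, fast))
    ref_means = [sum(p[0] for p in pairs) / len(pairs) for pairs in by_system.values()]
    fast_means = [sum(p[1] for p in pairs) / len(pairs) for pairs in by_system.values()]
    system_r = pearson(ref_means, fast_means)
    return example_r, system_r


# Toy scores (made up): per-response noise hurts example-level agreement,
# but it averages out within each system.
toy_records = [
    ("A", 0.2, 0.3), ("A", 0.4, 0.3),
    ("B", 0.6, 0.7), ("B", 0.8, 0.7),
    ("C", 0.9, 1.0), ("C", 1.0, 0.9),
]
example_r, system_r = example_and_system_correlation(toy_records)
```

Averaging per system smooths out per-response disagreement, so a system-level correlation that is higher than the example-level one, as in the reported findings, is a common pattern.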

Abstract

Metrics like FactScore and VeriScore that evaluate long-form factuality operate by decomposing an input response into atomic claims and then individually verifying each claim. While effective and interpretable, these methods incur numerous LLM calls and can take upwards of 100 seconds to evaluate a single response, limiting their practicality in large-scale evaluation and training scenarios. To address this, we propose VeriFastScore, which leverages synthetic data to fine-tune Llama3.1 8B for simultaneously extracting and verifying all verifiable claims within a given text based on evidence from Google Search. We show that this task cannot be solved via few-shot prompting with closed LLMs due to its complexity: the model receives ~4K tokens of evidence on average and needs to concurrently decompose claims, judge their verifiability, and verify them against noisy evidence. However, our…
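To make the cost difference described in the abstract concrete, the following sketch counts model calls for a decompose-then-verify pipeline versus a single-pass evaluator. It is a toy under stated assumptions, not the authors' implementation: every function name here is hypothetical, the "models" are string-matching stubs standing in for LLM calls, and claim extraction is simplified to one call per response.

```python
class CallCounter:
    """Wraps a callable and counts how many times it is invoked."""

    def __init__(self, fn):
        self.fn = fn
        self.calls = 0

    def __call__(self, *args, **kwargs):
        self.calls += 1
        return self.fn(*args, **kwargs)


def toy_extract_claims(response):
    # Stub for an LLM claim extractor: treat each sentence as one claim.
    return [s.strip().rstrip(".") for s in response.split(". ") if s.strip()]


def toy_verify_claim(claim, evidence):
    # Stub for an LLM verifier: supported if the claim appears in the evidence.
    return claim.lower() in evidence.lower()


def toy_extract_and_verify(response, evidence):
    # Stub for a single model that extracts and verifies in one pass.
    return [(c, toy_verify_claim(c, evidence)) for c in toy_extract_claims(response)]


def pipeline_eval(response, evidence, extract_claims, verify_claim):
    """Decompose-then-verify: one extraction call, then one call per claim."""
    claims = extract_claims(response)
    return [(claim, verify_claim(claim, evidence)) for claim in claims]


def single_pass_eval(response, evidence, extract_and_verify):
    """Single pass: one call returns every claim with its label."""
    return extract_and_verify(response, evidence)


response = ("Paris is in France. The Moon is made of cheese. "
            "Water boils at 100 C at sea level.")
evidence = "Paris is in France. Water boils at 100 C at sea level."

extractor = CallCounter(toy_extract_claims)
verifier = CallCounter(toy_verify_claim)
joint_model = CallCounter(toy_extract_and_verify)

pipeline_result = pipeline_eval(response, evidence, extractor, verifier)
single_result = single_pass_eval(response, evidence, joint_model)

pipeline_calls = extractor.calls + verifier.calls  # grows with the number of claims
single_calls = joint_model.calls                   # constant per response
```

In this toy setup the pipeline's call count grows linearly with the number of claims while the single-pass count stays at one per response, which is the source of the speedup the abstract describes.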

Code & Models

Repositories

rishanthrajendhran/verifastscore
PyTorch · Official

Models

rishanthrajendhran/VeriFastScore
model · 18 dl

Datasets

rishanthrajendhran/VeriFastScore
dataset · 78 dl

Videos

VeriFastScore: Speeding up long-form factuality evaluation · Underline

Taxonomy

Topics: Misinformation and Its Impacts · Topic Modeling · Explainable Artificial Intelligence (XAI)