OpenFActScore: Open-Source Atomic Evaluation of Factuality in Text Generation

Lucas Fonseca Lage; Simon Ostermann

arXiv:2507.05965·cs.CL·July 9, 2025

TL;DR

OpenFActScore is an open-source framework that evaluates the factual accuracy of text generated by large language models using atomic fact extraction and validation, promoting transparency and reproducibility.

Contribution

OpenFActScore adapts the FActScore framework to support open-source models, enabling broader access and reproducibility in factuality evaluation of LLM outputs.

Findings

01. Open models can approximate the performance of closed-source systems.

02. Gemma achieved the best overall performance among the open models evaluated.

03. The final setup correlates strongly with the original FActScore results (Pearson correlation of 0.99).
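
As an illustration of the correlation finding, the sketch below computes a Pearson coefficient between two lists of per-system scores. The function name `pearson` and the score lists are hypothetical placeholders chosen for this example; they are not values from the paper, and the paper may well have used a library routine instead.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances, left unnormalised because the factors cancel.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-system scores (NOT taken from the paper):
original_scores = [0.42, 0.58, 0.71, 0.66]  # e.g. original FActScore setup
open_scores = [0.45, 0.60, 0.70, 0.69]      # e.g. an open-model setup

r = pearson(original_scores, open_scores)
print(f"Pearson r = {r:.3f}")
```

A coefficient close to 1 means the open-model setup ranks and spaces the evaluated systems almost exactly as the original setup does, even if the absolute scores differ.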

Abstract

We introduce OpenFActScore, an open-source implementation of the FActScore framework for evaluating the factuality of text generated by large language models (LLMs). FActScore evaluates the factual accuracy of long-form text by using Atomic Fact Generation (AFG) to extract individual factual claims and Atomic Fact Validation (AFV) to verify each claim against a trusted knowledge source. While the original FActScore relies on closed-source and commercial models such as InstructGPT and ChatGPT, OpenFActScore enables the use of any Hugging Face-compatible model for both AFG and AFV. We provide a detailed technical overview of our implementation, highlighting design choices and modifications made to support open models. We evaluate multiple open-source LLMs on both AFG and AFV using the original FActScore benchmark, reporting BERTScore-F1 for AFG and Error Rate relative to human annotations…
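
The two-stage procedure described in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the OpenFActScore implementation: `toy_afg` is a hypothetical stand-in that splits on periods where the real system prompts an LLM for atomic facts, the validator is a simple set lookup where the real system queries a model against a knowledge source, and `error_rate` assumes Error Rate means the absolute gap between the estimated score and the human-annotated score.

```python
from typing import Callable, Iterable, List


def factscore(facts: Iterable[str], is_supported: Callable[[str], bool]) -> float:
    """Fraction of atomic facts that the validator marks as supported."""
    facts = list(facts)
    if not facts:
        raise ValueError("no atomic facts to score")
    supported = sum(1 for fact in facts if is_supported(fact))
    return supported / len(facts)


def error_rate(estimated: float, human: float) -> float:
    """Assumed definition: absolute gap between estimated and human score."""
    return abs(estimated - human)


def toy_afg(text: str) -> List[str]:
    """Hypothetical AFG stand-in: one 'atomic fact' per sentence.

    The real AFG step uses an LLM to break sentences into atomic claims.
    """
    return [part.strip() for part in text.split(".") if part.strip()]


# Hypothetical knowledge source and generation, for illustration only.
knowledge = {
    "Marie Curie was born in Warsaw",
    "Marie Curie won two Nobel Prizes",
}
generation = (
    "Marie Curie was born in Warsaw. "
    "Marie Curie won two Nobel Prizes. "
    "Marie Curie was born in 1900."
)

facts = toy_afg(generation)
# Hypothetical AFV stand-in: a fact is supported if it appears verbatim.
score = factscore(facts, lambda fact: fact in knowledge)
print(f"{len(facts)} facts, score = {score:.2f}")
```

Passing the validator as a callable keeps the scoring logic independent of which model performs validation, which mirrors the abstract's point that any compatible model can fill either role.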

Code & Models

Repositories

lflage/openfactscore (official)

Taxonomy

Topics: Misinformation and Its Impacts · Artificial Intelligence in Healthcare and Education · Topic Modeling