FACT-AUDIT: An Adaptive Multi-Agent Framework for Dynamic Fact-Checking Evaluation of Large Language Models

Hongzhan Lin; Yang Deng; Yuxuan Gu; Wenxuan Zhang; Jing Ma; See-Kiong Ng; Tat-Seng Chua

arXiv:2502.17924·cs.CL·March 4, 2025

TL;DR

FACT-AUDIT is a dynamic, multi-agent framework that adaptively evaluates large language models' fact-checking abilities, including justification quality, providing a comprehensive and evolving assessment of their trustworthiness.

Contribution

This work introduces FACT-AUDIT, a novel adaptive, multi-agent framework that assesses LLMs' fact-checking performance beyond static datasets by incorporating justification analysis and iterative evaluation.

Findings

01. Effectively differentiates among state-of-the-art LLMs.

02. Provides insights into models' strengths and limitations.

03. Enhances fact-checking evaluation with dynamic, model-centric assessments.

Abstract

Large Language Models (LLMs) have significantly advanced fact-checking research. However, existing automated fact-checking evaluation methods rely on static datasets and classification metrics, which can neither automatically evaluate justification production nor uncover the nuanced limitations of LLMs in fact-checking. In this work, we introduce FACT-AUDIT, an agent-driven framework that adaptively and dynamically assesses LLMs' fact-checking capabilities. Leveraging importance sampling principles and multi-agent collaboration, FACT-AUDIT generates adaptive and scalable datasets, performs iterative model-centric evaluations, and updates assessments based on model-specific responses. By incorporating justification production alongside verdict prediction, this framework provides a comprehensive and evolving audit of LLMs' factual reasoning capabilities, to investigate their…
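The adaptive loop described in the abstract (sample test cases, evaluate the model, update the assessment from its responses) can be pictured with a small sketch. This is not the authors' implementation: it uses plain failure-weighted sampling as a stand-in for the importance sampling principles the abstract mentions, and the names `adaptive_audit`, `toy_model`, the `boost` factor, and the scenario labels are all hypothetical.

```python
import random


def adaptive_audit(scenarios, evaluate, rounds=100, boost=1.5, seed=0):
    """Failure-weighted sampling sketch (hypothetical update rule).

    Each round draws one scenario with probability proportional to its
    weight, asks `evaluate` whether the audited model passed, and multiplies
    the weight by `boost` on failure so weak areas are probed more often.
    """
    rng = random.Random(seed)
    weights = {s: 1.0 for s in scenarios}
    draws = {s: 0 for s in scenarios}
    failures = {s: 0 for s in scenarios}
    for _ in range(rounds):
        names = list(weights)
        chosen = rng.choices(names, weights=[weights[n] for n in names], k=1)[0]
        draws[chosen] += 1
        if not evaluate(chosen, rng):
            failures[chosen] += 1
            weights[chosen] *= boost
    return weights, draws, failures


def toy_model(scenario, rng):
    """Stand-in for an LLM plus a judge: passes with a made-up probability."""
    fail_rate = {"scenario_a": 0.6, "scenario_b": 0.3, "scenario_c": 0.1}[scenario]
    return rng.random() >= fail_rate


weights, draws, failures = adaptive_audit(
    ["scenario_a", "scenario_b", "scenario_c"], toy_model
)
for name in weights:
    print(name, draws[name], failures[name], round(weights[name], 2))
```

In a real audit the `evaluate` callback would wrap a model call and a grader that scores both the verdict and the justification; here it is a random stub so the sketch runs without any external service.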

Code & Models

Repositories

DanielLin97/FACT-AUDIT (Official)

Videos

FACT-AUDIT: An Adaptive Multi-Agent Framework for Dynamic Fact-Checking Evaluation of Large Language Models

Taxonomy

Topics: Topic Modeling