Evaluating Evidence Attribution in Generated Fact Checking Explanations

Rui Xing; Timothy Baldwin; Jey Han Lau

arXiv:2406.12645·cs.CL·February 12, 2025

TL;DR

This paper proposes a new evaluation method for evidence attribution in fact-checking explanations, showing that current models often produce inaccurate attributions and emphasizing the importance of human-curated evidence.

Contribution

Introduces a citation masking and recovery protocol for evaluating attribution quality, demonstrates that LLM-based annotation correlates with human judgment, and highlights the need for human-curated evidence.

Findings

01. LLM annotation of attribution quality correlates with human annotation

02. The best-performing LLMs still generate explanations with inaccurate attributions

03. Human-curated evidence is essential for generating better explanations

Abstract

Automated fact-checking systems often struggle with trustworthiness, as their generated explanations can include hallucinations. In this work, we explore evidence attribution for fact-checking explanation generation. We introduce a novel evaluation protocol -- citation masking and recovery -- to assess attribution quality in generated explanations. We implement our protocol using both human annotators and automatic annotators, and find that LLM annotation correlates with human annotation, suggesting that attribution assessment can be automated. Finally, our experiments reveal that: (1) the best-performing LLMs still generate explanations with inaccurate attributions; and (2) human-curated evidence is essential for generating better explanations. Code and data are available here: https://github.com/ruixing76/Transparent-FCExp.
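The abstract names the citation masking and recovery protocol without spelling out its mechanics, so the following is only a minimal sketch of one plausible reading: citation markers in an explanation are hidden, an annotator guesses which evidence item each hidden marker pointed to, and attribution quality is scored by how many guesses match the original citations. The `[n]` marker format, the `[MASK]` token, the function names, the toy word-overlap annotator, and the example data are all assumptions made for illustration; they are not taken from the paper or its repository.

```python
import re
from typing import List, Tuple

# Assumed citation format: bracketed 1-based evidence indices such as [1], [2].
CITATION = re.compile(r"\[(\d+)\]")
MASK = "[MASK]"  # hypothetical mask token


def mask_citations(explanation: str) -> Tuple[str, List[int]]:
    """Hide every citation marker; return the masked text and the gold ids in order."""
    gold = [int(n) for n in CITATION.findall(explanation)]
    masked = CITATION.sub(MASK, explanation)
    return masked, gold


def _tokens(text: str) -> set:
    """Lowercased alphabetic word set, used only by the toy annotator."""
    return set(re.findall(r"[a-z]+", text.lower()))


def overlap_annotator(masked: str, evidence: List[str]) -> List[int]:
    """Toy stand-in for a human or LLM annotator (an assumption, not the paper's method).

    For each sentence, every mask is assigned the 1-based index of the
    evidence item sharing the most words with that sentence.
    """
    evidence_tokens = [_tokens(e) for e in evidence]
    predictions: List[int] = []
    for sentence in re.split(r"(?<=[.!?])\s+", masked):
        n_masks = sentence.count(MASK)
        if n_masks == 0:
            continue
        words = _tokens(sentence)
        best = max(range(len(evidence)), key=lambda i: len(words & evidence_tokens[i]))
        predictions.extend([best + 1] * n_masks)
    return predictions


def recovery_accuracy(gold: List[int], predicted: List[int]) -> float:
    """Fraction of masked citations whose original evidence id was recovered."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        return 0.0
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


# Illustrative data only; not drawn from the paper's dataset.
EVIDENCE = [
    "The city council approved the budget in March.",
    "Unemployment fell to four percent last year.",
]
EXPLANATION = (
    "The budget was approved by the council in March [1]. "
    "Unemployment fell to four percent [2]."
)

if __name__ == "__main__":
    masked_text, gold_ids = mask_citations(EXPLANATION)
    predicted_ids = overlap_annotator(masked_text, EVIDENCE)
    print(masked_text)
    print(recovery_accuracy(gold_ids, predicted_ids))
```

Under this reading, an explanation whose citations are easy to recover is one whose cited evidence actually supports the sentences that cite it. Replacing `overlap_annotator` with human judgments or an LLM call would give the two annotation settings the abstract compares.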

Code & Models

Repositories

ruixing76/transparent-fcexp (Official)

Videos

Evaluating Evidence Attribution in Generated Fact Checking Explanations (Underline)

Taxonomy

Topics: Software Engineering Research
