Tell Me Why: Explainable Public Health Fact-Checking with Large Language Models

Majid Zarharan; Pascal Wullschleger; Babak Behkam Kia; Mohammad Taher Pilehvar; Jennifer Foster

arXiv:2405.09454·cs.CL·December 19, 2024

TL;DR

This paper evaluates large language models' ability to verify and explain public health claims, comparing prompting and fine-tuning methods, and introduces a dual automatic and human evaluation approach.

Contribution

It provides a comprehensive analysis of explainable fact-checking with LLMs, highlighting performance differences across prompting and fine-tuning, and introduces a novel human evaluation framework.

Findings

1. GPT-4 excels in zero-shot verification and explanation.

2. Open-source models can match or surpass GPT-4 with fine-tuning.

3. Human evaluation uncovers issues with gold explanations.

Abstract

This paper presents a comprehensive analysis of explainable fact-checking through a series of experiments, focusing on the ability of large language models to verify public health claims and provide explanations or justifications for their veracity assessments. We examine the effectiveness of zero/few-shot prompting and parameter-efficient fine-tuning across various open and closed-source models, assessing their performance in both isolated and joint tasks of veracity prediction and explanation generation. Importantly, we employ a dual evaluation approach comprising previously established automatic metrics and a novel set of criteria through human evaluation. Our automatic evaluation indicates that, within the zero-shot scenario, GPT-4 emerges as the standout performer, but in few-shot and parameter-efficient fine-tuning contexts, open-source models demonstrate their capacity to not…
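
As a rough illustration of the zero/few-shot prompting setup the abstract describes for the joint task (veracity prediction plus explanation generation), the sketch below assembles a single prompt string. The label set, the prompt wording, and the `build_prompt` helper are all assumptions made for illustration; they are not taken from the paper or its repository.

```python
from typing import Iterable, Tuple

# Hypothetical label set, chosen for illustration only (not from the paper).
LABELS = ("true", "false", "mixture", "unproven")

# One demonstration: (claim, evidence, label, explanation).
Example = Tuple[str, str, str, str]


def build_prompt(claim: str, evidence: str, examples: Iterable[Example] = ()) -> str:
    """Build a prompt asking for a veracity label and a short explanation.

    With no examples the prompt is zero-shot; each supplied example
    adds one demonstration, making it few-shot.
    """
    parts = [
        "Classify the claim as one of: "
        + ", ".join(LABELS)
        + ". Then explain the verdict in one or two sentences."
    ]
    for ex_claim, ex_evidence, ex_label, ex_explanation in examples:
        parts.append(
            f"Claim: {ex_claim}\n"
            f"Evidence: {ex_evidence}\n"
            f"Label: {ex_label}\n"
            f"Explanation: {ex_explanation}"
        )
    # The final block is left open so the model completes label and explanation.
    parts.append(f"Claim: {claim}\nEvidence: {evidence}\nLabel:")
    return "\n\n".join(parts)
```

Calling `build_prompt(claim, evidence)` yields a zero-shot prompt, while passing a list of worked examples yields a few-shot one; the returned string can then be sent to whichever model is being evaluated.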

Code & Models

Repositories

Zarharan/NLE-for-fact-checking (PyTorch, official)

Videos

Tell Me Why: Explainable Public Health Fact-Checking with Large Language Models

Taxonomy

Topics

Topic Modeling · Biomedical Text Mining and Ontologies · Machine Learning in Healthcare

Methods

Attention Is All You Need · Sparse Evolutionary Training · Linear Layer · Multi-Head Attention · Dense Connections · Position-Wise Feed-Forward Layer · Dropout · Label Smoothing · Residual Connection · Absolute Position Encodings