The Perils & Promises of Fact-checking with Large Language Models

Dorian Quelle; Alexandre Bovet

arXiv:2310.13549 · cs.CL · February 8, 2024 · 2 cites

Open Access

TL;DR

This paper evaluates the capabilities and limitations of Large Language Models like GPT-4 in automated fact-checking, emphasizing the importance of context, source citation, and cautious application due to inconsistent accuracy.

Contribution

It introduces a framework for LLM-based fact-checking that includes reasoning and source citation, and compares GPT-4's performance to GPT-3 across various conditions.

Findings

1. GPT-4 outperforms GPT-3 in fact-checking accuracy.

2. Contextual information significantly improves LLM performance.

3. Accuracy varies with query language and claim veracity.

Abstract

Automated fact-checking, using machine learning to verify claims, has grown vital as misinformation spreads beyond human fact-checking capacity. Large Language Models (LLMs) like GPT-4 are increasingly trusted to write academic papers, lawsuits, and news articles and to verify information, emphasizing their role in discerning truth from falsehood and the importance of being able to verify their outputs. Understanding the capacities and limitations of LLMs in fact-checking tasks is therefore essential for ensuring the health of our information ecosystem. Here, we evaluate the use of LLM agents in fact-checking by having them phrase queries, retrieve contextual data, and make decisions. Importantly, in our framework, agents explain their reasoning and cite the relevant sources from the retrieved context. Our results show the enhanced prowess of LLMs when equipped with contextual…
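The abstract describes a pipeline in which an agent phrases a query, retrieves context, and returns a decision together with its reasoning and cited sources. A minimal sketch of that control flow follows. It is not the authors' implementation: every name in it (`Verdict`, `fact_check`, `TOY_CORPUS`, and the `toy_*` helpers) is hypothetical, and the LLM calls and the search step are replaced by simple keyword-overlap stand-ins so the example runs on its own.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Verdict:
    """Decision plus the explanation and sources behind it (hypothetical type)."""
    label: str
    reasoning: str
    cited_sources: List[str] = field(default_factory=list)


def fact_check(
    claim: str,
    phrase_query: Callable[[str], str],
    retrieve: Callable[[str], Dict[str, str]],
    decide: Callable[[str, Dict[str, str]], Verdict],
) -> Verdict:
    """Run the three steps named in the abstract: query, retrieve, decide."""
    query = phrase_query(claim)      # step 1: turn the claim into a search query
    context = retrieve(query)        # step 2: fetch passages keyed by source id
    return decide(claim, context)    # step 3: verdict with reasoning and citations


# Toy stand-ins (assumptions for illustration only). In a real system,
# phrase_query and decide would call an LLM and retrieve would call a search API.
TOY_CORPUS = {
    "doc-1": "The Eiffel Tower is located in Paris.",
    "doc-2": "Mount Everest is the highest mountain above sea level.",
}


def toy_phrase_query(claim: str) -> str:
    return claim.lower().rstrip(".")


def toy_retrieve(query: str) -> Dict[str, str]:
    # Keep passages that share at least three words with the query.
    words = set(query.split())
    return {
        doc_id: text
        for doc_id, text in TOY_CORPUS.items()
        if len(words & set(text.lower().rstrip(".").split())) >= 3
    }


def toy_decide(claim: str, context: Dict[str, str]) -> Verdict:
    if not context:
        return Verdict("not enough evidence", "No retrieved passage overlaps with the claim.")
    sources = sorted(context)
    reasoning = f"Retrieved {len(sources)} passage(s) sharing terms with the claim."
    return Verdict("evidence found", reasoning, sources)


if __name__ == "__main__":
    print(fact_check("The Eiffel Tower is in Paris.", toy_phrase_query, toy_retrieve, toy_decide))
    print(fact_check("Bananas grow on Mars.", toy_phrase_query, toy_retrieve, toy_decide))
```

Passing the three steps in as callables keeps the control flow separate from the model and the search backend, so either can be swapped without touching `fact_check`. Keyword overlap only shows that a passage is related to a claim, not that it supports it, which is why the toy labels stop short of "true" or "false".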

Taxonomy

Topics: Topic Modeling

Methods: Multi-Head Attention · Attention Is All You Need · Softmax · Position-Wise Feed-Forward Layer · Dense Connections · Residual Connection · Cosine Annealing · Absolute Position Encodings