The Perils & Promises of Fact-checking with Large Language Models
Dorian Quelle, Alexandre Bovet

TL;DR
This paper evaluates the capabilities and limitations of Large Language Models like GPT-4 in automated fact-checking, emphasizing the importance of context, source citation, and cautious application due to inconsistent accuracy.
Contribution
It introduces a framework for LLM-based fact-checking in which the model explains its reasoning and cites its sources, and compares GPT-4's performance with that of GPT-3 across various conditions.
Findings
GPT-4 outperforms GPT-3 in fact-checking accuracy
Contextual information significantly improves LLM performance
Accuracy varies with query language and claim veracity
Abstract
Automated fact-checking, using machine learning to verify claims, has grown vital as misinformation spreads beyond human fact-checking capacity. Large Language Models (LLMs) like GPT-4 are increasingly trusted to write academic papers, lawsuits, and news articles and to verify information, emphasizing their role in discerning truth from falsehood and the importance of being able to verify their outputs. Understanding the capacities and limitations of LLMs in fact-checking tasks is therefore essential for ensuring the health of our information ecosystem. Here, we evaluate the use of LLM agents in fact-checking by having them phrase queries, retrieve contextual data, and make decisions. Importantly, in our framework, agents explain their reasoning and cite the relevant sources from the retrieved context. Our results show the enhanced prowess of LLMs when equipped with contextual…
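The pipeline the abstract describes (phrase a query, retrieve context, decide, then explain and cite) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: `fact_check`, `Verdict`, the toy corpus, the verdict labels, and the three stand-in functions are all hypothetical names introduced here. In a real system an LLM and a search backend would replace the toy functions.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Document = Tuple[str, str]  # (source identifier, text); hypothetical shape


@dataclass
class Verdict:
    label: str                # hypothetical labels: "supported" or "uncertain"
    reasoning: str            # explanation produced alongside the label
    cited_sources: List[str]  # identifiers of the documents relied upon


def fact_check(
    claim: str,
    phrase_query: Callable[[str], str],
    retrieve: Callable[[str], List[Document]],
    decide: Callable[[str, List[Document]], Verdict],
) -> Verdict:
    """Run the three steps in order: phrase a query, retrieve, decide."""
    query = phrase_query(claim)
    documents = retrieve(query)
    verdict = decide(claim, documents)
    # Keep only citations that point at documents actually retrieved,
    # so every cited source can be checked against the context.
    retrieved_ids = {source for source, _ in documents}
    verdict.cited_sources = [s for s in verdict.cited_sources if s in retrieved_ids]
    return verdict


# Toy stand-ins (assumptions for illustration only; no LLM or web search).
TOY_CORPUS: List[Document] = [
    ("source-a", "The Eiffel Tower is located in Paris."),
    ("source-b", "Mount Everest is the highest mountain above sea level."),
]


def toy_phrase_query(claim: str) -> str:
    # Normalise the claim into a lowercase keyword query.
    return claim.lower().rstrip(".?!")


def toy_retrieve(query: str) -> List[Document]:
    # Return documents sharing at least three words with the query.
    words = set(query.split())
    return [
        (source, text)
        for source, text in TOY_CORPUS
        if len(words & set(text.lower().rstrip(".").split())) >= 3
    ]


def toy_decide(claim: str, documents: List[Document]) -> Verdict:
    # Word overlap is a placeholder for the model's actual judgement.
    if not documents:
        return Verdict("uncertain", "No relevant context was retrieved.", [])
    sources = [source for source, _ in documents]
    reasoning = f"Retrieved context overlaps with the claim ({len(documents)} document(s))."
    return Verdict("supported", reasoning, sources)


result = fact_check(
    "The Eiffel Tower is in Paris.", toy_phrase_query, toy_retrieve, toy_decide
)
print(result.label, result.cited_sources)
```

Passing the three steps in as functions keeps the control flow separate from any particular model or search provider, so the same loop could be run with and without retrieved context when comparing conditions.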
Taxonomy
Topics: Topic Modeling
Methods: Multi-Head Attention · Attention Is All You Need · Softmax · Position-Wise Feed-Forward Layer · Dense Connections · Residual Connection · Cosine Annealing · Absolute Position Encodings
