GenAI vs. Human Fact-Checkers: Accurate Ratings, Flawed Rationales

Yuehong Cassandra Tai; Khushi Navin Patni; Nicholas Daniel Hemauer; Bruce Desmarais; and Yu-Ru Lin

arXiv:2502.14943·cs.AI·February 27, 2025

Open Access

TL;DR

This study evaluates GenAI models' ability to assess content credibility. It finds only moderate agreement with human coders and shows that the models rely on linguistic cues rather than an understanding of veracity, which bears on how they can support fact-checking.

Contribution

It provides a comprehensive assessment of GenAI models' performance in credibility evaluation and analyzes their reasoning patterns, emphasizing limitations and potential for supporting human fact-checkers.

Findings

01

GPT-4o outperforms other models in credibility ratings

02

All models rely heavily on linguistic features rather than true veracity understanding

03

Summarized content can improve efficiency without reducing accuracy

Abstract

Despite recent advances in understanding the capabilities and limits of generative artificial intelligence (GenAI) models, we are just beginning to understand their capacity to assess and reason about the veracity of content. We evaluate multiple GenAI models across tasks that involve the rating of, and perceived reasoning about, the credibility of information. The information in our experiments comes from content that subnational U.S. politicians post to Facebook. We find that GPT-4o, one of the most used AI models in consumer applications, outperforms other models, but all models exhibit only moderate agreement with human coders. Importantly, even when GenAI models accurately identify low-credibility content, their reasoning relies heavily on linguistic features and "hard" criteria, such as the level of detail, source reliability, and language formality, rather than an understanding…

Taxonomy

Topics: Misinformation and Its Impacts · Ethics and Social Impacts of AI · Explainable Artificial Intelligence (XAI)