Eroding the Truth-Default: A Causal Analysis of Human Susceptibility to Foundation Model Hallucinations and Disinformation in the Wild
Alexander Loth, Martin Kappes, Marc-Oliver Pahl

TL;DR
This study uses causal models to investigate human susceptibility to AI-generated misinformation. It finds that familiarity with fake news, rather than political orientation, is associated with detection ability, and it highlights a 'fluency trap' in AI outputs.
Contribution
Introduces the JudgeGPT and RogueGPT frameworks and applies causal analysis to identify factors affecting human detection of AI hallucinations and disinformation.
Findings
Fake news familiarity correlates with detection performance (r=0.35).
Political orientation shows negligible impact (r=-0.10).
GPT-4 outputs bypass source monitoring, creating a 'fluency trap'.
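For readers unfamiliar with the r values reported above, the following is a minimal sketch of how a correlation coefficient of this kind can be computed. It assumes r denotes the Pearson correlation, which the summary does not state explicitly. The function name `pearson_r` and the toy data are hypothetical illustrations, not the study's data or code.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy data (NOT from the paper): self-reported familiarity
# on a 1-5 scale and per-participant detection accuracy.
familiarity = [1, 2, 3, 4, 5]
accuracy = [0.40, 0.50, 0.45, 0.70, 0.60]

r = pearson_r(familiarity, accuracy)
print(f"r = {r:.2f}")
```

A value near 0 (such as the reported r=-0.10) indicates almost no linear association, while r=0.35 indicates a moderate positive one. Neither value by itself establishes causation.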
Abstract
As foundation models (FMs) approach human-level fluency, distinguishing synthetic from organic content has become a key challenge for Trustworthy Web Intelligence. This paper presents JudgeGPT and RogueGPT, a dual-axis framework that decouples "authenticity" from "attribution" to investigate the mechanisms of human susceptibility. Analyzing 918 evaluations across five FMs (including GPT-4 and Llama-2), we employ Structural Causal Models (SCMs) as a principal framework for formulating testable causal hypotheses about detection accuracy. Contrary to partisan narratives, we find that political orientation shows a negligible association with detection performance (r = -0.10). Instead, "fake news familiarity" emerges as a candidate mediator (r = 0.35), suggesting that exposure may function as adversarial training for human discriminators. We identify a "fluency trap" where GPT-4 outputs…
Taxonomy
Topics: Misinformation and Its Impacts · Deception detection and forensic psychology · Privacy, Security, and Data Protection
