Using Grok to Avoid Personal Attacks While Correcting Misinformation on X
Kevin Matthe Caramancion

TL;DR
This paper reports that correcting misinformation by invoking Grok, the native large language model on X, is associated with far fewer hostile ad hominem attacks than direct human replies, which could improve online discourse.
Contribution
It provides empirical evidence that AI-mediated corrections via Grok attract less hostility than direct human corrections of misinformation, suggesting a novel approach to improving online interactions.
Findings
72% of human corrections received ad hominem attacks
Grok-mediated corrections received no ad hominem attacks
Statistically significant reduction in hostility with AI mediation
Abstract
Correcting misinformation in public online spaces often exposes users to hostility and ad hominem attacks, discouraging participation in corrective discourse. This study presents empirical evidence that invoking Grok, the native large language model on X, rather than directly confronting other users, is associated with different social responses during misinformation correction. Using an observational design, 100 correction replies across five high-conflict misinformation topics were analyzed, with corrections balanced between Grok-mediated and direct human-issued responses. The primary outcome was whether a correction received at least one ad hominem attack within a 24-hour window. Ad hominem attacks occurred in 72 percent of human-issued corrections and in none of the Grok-mediated corrections. A chi-square test confirmed a statistically significant association with a large effect…
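The chi-square test mentioned in the abstract can be illustrated with a minimal sketch. It assumes that "balanced" means 50 Grok-mediated and 50 human-issued corrections, so that 72 percent corresponds to 36 of 50 human corrections; these counts are reconstructed from the abstract, not taken from the paper's tables. The sketch uses the uncorrected Pearson statistic with phi as the effect size, which may differ from the exact procedure the author used. The function name `chi_square_2x2` is hypothetical.

```python
import math

# Counts reconstructed from the abstract, ASSUMING a 50/50 split of the 100 replies.
human_attacked, human_total = 36, 50  # 72 percent of 50 human-issued corrections
grok_attacked, grok_total = 0, 50     # no Grok-mediated correction was attacked


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    phi = math.sqrt(chi2 / n)  # effect size for a 2x2 table
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, phi, p_value


chi2, phi, p_value = chi_square_2x2(
    human_attacked, human_total - human_attacked,
    grok_attacked, grok_total - grok_attacked,
)
print(f"chi2 = {chi2:.2f}, phi = {phi:.2f}, p = {p_value:.2e}")
```

Under these assumed counts the statistic is 56.25 with phi = 0.75, which is consistent with the abstract's description of a significant association with a large effect. The figures reported in the paper itself may differ, for example if a continuity correction was applied.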
Taxonomy
Topics: Misinformation and Its Impacts · Deception detection and forensic psychology · Psychological and Educational Research Studies
