Fake News in Sheep's Clothing: Robust Fake News Detection Against LLM-Empowered Style Attacks

Jiaying Wu; Jiafeng Guo; Bryan Hooi

arXiv:2310.10830 · cs.CL · August 21, 2024 · 5 cites

TL;DR

This paper reveals that LLMs can craft style-mimicking fake news that defeats current detectors, and introduces SheepDog, a new style-robust fake news detector that emphasizes content over style for improved resilience.

Contribution

We propose SheepDog, a novel fake news detection method that is robust against style-based attacks by focusing on content and leveraging LLM-generated style variations during training.
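
One way to picture "content over style" training is a loss that penalizes the detector when its prediction changes between an article and its style-reframed versions. The sketch below is a minimal illustration under that assumption, not the paper's actual objective, which this summary does not spell out. The names `style_robust_loss` and `weight`, and the choice of KL divergence as the consistency term, are hypothetical.

```python
import math

def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label):
    """Negative log-likelihood of the true label."""
    return -math.log(max(probs[label], 1e-12))

def kl_divergence(p, q):
    """KL(p || q), clamping q to avoid division by zero."""
    return sum(pi * math.log(pi / max(qi, 1e-12)) for pi, qi in zip(p, q) if pi > 0)

def style_robust_loss(logits_original, logits_reframed, label, weight=1.0):
    """Hypothetical objective: classify the original article correctly and
    keep predictions on its reframed versions close to the original one.

    logits_original: detector scores for the original article
    logits_reframed: list of score vectors, one per style-reframed version
    label: index of the true class (e.g. 0 = real, 1 = fake)
    weight: assumed trade-off between the two terms
    """
    p_orig = softmax(logits_original)
    ce = cross_entropy(p_orig, label)
    if not logits_reframed:
        return ce
    consistency = sum(
        kl_divergence(p_orig, softmax(l)) for l in logits_reframed
    ) / len(logits_reframed)
    return ce + weight * consistency
```

When the reframed versions receive the same scores as the original, the consistency term vanishes and only the classification loss remains. A detector that flips its prediction after a change of style pays an extra penalty.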

Findings

01

SheepDog maintains high accuracy across style variations.

02

LLM-empowered style attacks reduce the F1 score of state-of-the-art text-based detectors by up to 38%.

03

SheepDog outperforms existing detectors in style robustness on real-world benchmarks.
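
The findings above are stated in terms of F1 score. As a rough illustration of how a drop under attack can be measured, the sketch below computes binary F1 before and after an attack and reports the relative decrease. The labels and predictions are made-up toy values, not results from the paper, and this summary does not say whether the reported decrease is relative or absolute or which F1 variant is used. The helper names are hypothetical.

```python
def f1_score(y_true, y_pred, positive=1):
    """Binary F1 for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def relative_drop(before, after):
    """Fraction of the original score lost after the attack."""
    return (before - after) / before

# Toy data: 1 = fake, 0 = real (illustrative only).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
pred_clean = [1, 1, 1, 1, 0, 0, 0, 0]      # detector on original articles
pred_attacked = [1, 1, 0, 0, 0, 0, 0, 0]   # half the fake articles now evade detection

f1_clean = f1_score(y_true, pred_clean)
f1_attacked = f1_score(y_true, pred_attacked)
drop = relative_drop(f1_clean, f1_attacked)
```

In this toy case the camouflaged fake articles lower recall, which lowers F1 even though precision stays perfect.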

Abstract

It is commonly perceived that fake news and real news exhibit distinct writing styles, such as the use of sensationalist versus objective language. However, we emphasize that style-related features can also be exploited for style-based attacks. Notably, the advent of powerful Large Language Models (LLMs) has empowered malicious actors to mimic the style of trustworthy news sources, doing so swiftly, cost-effectively, and at scale. Our analysis reveals that LLM-camouflaged fake news content significantly undermines the effectiveness of state-of-the-art text-based detectors (up to 38% decrease in F1 Score), implying a severe vulnerability to stylistic variations. To address this, we introduce SheepDog, a style-robust fake news detector that prioritizes content over style in determining news veracity. SheepDog achieves this resilience through (1) LLM-empowered news reframings that inject…

Code & Models

Repositories

jiayingwu19/sheepdog (PyTorch, official)

Taxonomy

Topics: Misinformation and Its Impacts · Spam and Phishing Detection · Sentiment Analysis and Opinion Mining