Robust Fake News Detection using Large Language Models under Adversarial Sentiment Attacks
Sahar Tahmasebi, Eric Müller-Budack, Ralph Ewerth

TL;DR
This paper examines the vulnerability of fake news detection models to sentiment manipulation by adversaries using LLMs and introduces AdSent, a framework that enhances robustness by making detection sentiment-agnostic.
Contribution
The study reveals the impact of sentiment shifts on detection accuracy and proposes a novel training strategy to improve robustness against sentiment-based adversarial attacks.
Findings
Sentiment manipulation significantly affects detection performance.
Detectors are biased toward classifying neutral articles as real and non-neutral articles as fake.
AdSent outperforms baselines in accuracy and robustness across datasets.
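
One way to make the findings above concrete is to compare a detector's predictions on original articles with its predictions on sentiment-rewritten copies of the same articles. The sketch below is illustrative only and is not the paper's evaluation protocol: `robustness_report`, `toy_detector`, the keyword list, and the example sentences are all hypothetical, and the rewritten texts are written by hand rather than generated by an LLM.

```python
from typing import Callable, Dict, Sequence


def robustness_report(
    predict: Callable[[str], int],
    originals: Sequence[str],
    rewritten: Sequence[str],
    labels: Sequence[int],
) -> Dict[str, float]:
    """Compare a detector on original texts and sentiment-rewritten copies.

    Labels and predictions use 1 = fake, 0 = real (an assumption of this sketch).
    """
    if not labels or not (len(originals) == len(rewritten) == len(labels)):
        raise ValueError("inputs must be non-empty and of equal length")

    preds_orig = [predict(text) for text in originals]
    preds_adv = [predict(text) for text in rewritten]
    n = len(labels)

    acc_orig = sum(p == y for p, y in zip(preds_orig, labels)) / n
    acc_adv = sum(p == y for p, y in zip(preds_adv, labels)) / n
    # Share of articles whose prediction changes after the sentiment rewrite.
    flip_rate = sum(a != b for a, b in zip(preds_orig, preds_adv)) / n

    return {
        "acc_original": acc_orig,
        "acc_attacked": acc_adv,
        "accuracy_drop": acc_orig - acc_adv,
        "flip_rate": flip_rate,
    }


def toy_detector(text: str) -> int:
    """Hypothetical, deliberately sentiment-biased detector (1 = fake, 0 = real)."""
    emotional = {"shocking", "outrageous", "terrible", "amazing"}
    words = (w.strip(".,:;!?").lower() for w in text.split())
    return int(any(w in emotional for w in words))


# Hand-written toy data: a real neutral article and a fake emotional one,
# each paired with a copy whose sentiment has been changed.
originals = [
    "The council approved the budget on Monday.",
    "Shocking cure hidden by doctors!",
]
rewritten = [
    "Outrageous: the council approved the budget on Monday!",
    "A cure was reportedly not disclosed by doctors.",
]
labels = [0, 1]

report = robustness_report(toy_detector, originals, rewritten, labels)
print(report)
```

A sentiment-agnostic detector would ideally show a flip rate near zero on such pairs, since the rewrite changes tone but not the underlying claim.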
Abstract
Misinformation and fake news have become a pressing societal challenge, driving the need for reliable automated detection methods. Prior research has highlighted sentiment as an important signal in fake news detection, either by analyzing which sentiments are associated with fake news or by using sentiment and emotion features for classification. However, this poses a vulnerability, since adversaries can manipulate sentiment to evade detectors, especially with the advent of large language models (LLMs). A few studies have explored adversarial samples generated by LLMs, but they mainly focus on stylistic features such as the writing style of news publishers. Thus, the crucial vulnerability of sentiment manipulation remains largely unexplored. In this paper, we investigate the robustness of state-of-the-art fake news detectors under sentiment manipulation. We introduce AdSent, a…
Taxonomy
Topics: Misinformation and Its Impacts · Sentiment Analysis and Opinion Mining · Spam and Phishing Detection
