Faking Fake News for Real Fake News Detection: Propaganda-loaded Training Data Generation

Kung-Hsiang Huang; Kathleen McKeown; Preslav Nakov; Yejin Choi; Heng Ji

arXiv:2203.05386·cs.CL·May 17, 2023·5 cites

Open Access · 1 Repo

TL;DR

This paper introduces a data generation framework that produces propaganda-informed training examples, so that detectors trained on them transfer better to human-written fake news and narrow the gap between machine-generated fake news and human-authored disinformation.

Contribution

The authors propose a new framework for generating propaganda-loaded training data, resulting in the PropaNews dataset, which enhances fake news detection accuracy on real human-authored disinformation.

Findings

01 Fake news detectors trained on PropaNews detect human-written disinformation better, by 3.62-7.69% F1 score.

02 The framework incorporates propaganda techniques such as appeal to authority and loaded language.

03 The PropaNews dataset contains 2,256 examples and is released for future use.

Abstract

Despite recent advances in detecting fake news generated by neural models, their results are not readily applicable to effective detection of human-written disinformation. What limits the successful transfer between them is the sizable gap between machine-generated fake news and human-authored ones, including the notable differences in terms of style and underlying intent. With this in mind, we propose a novel framework for generating training examples that are informed by the known styles and strategies of human-authored propaganda. Specifically, we perform self-critical sequence training guided by natural language inference to ensure the validity of the generated articles, while also incorporating propaganda techniques, such as appeal to authority and loaded language. In particular, we create a new training dataset, PropaNews, with 2,256 examples, which we release for future use. Our…
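
The self-critical sequence training step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common SCST formulation, in which the reward of a sampled sequence is compared against the reward of the greedy decode used as a baseline. The function names, token probabilities, and reward values are hypothetical, and the NLI-guided reward is stood in for by plain numbers rather than a real NLI model. This is not the authors' implementation.

```python
import math
from typing import Sequence


def sequence_log_prob(token_probs: Sequence[float]) -> float:
    """Sum of log-probabilities the generator assigned to each sampled token."""
    return sum(math.log(p) for p in token_probs)


def scst_loss(
    sampled_token_probs: Sequence[float],
    sampled_reward: float,
    greedy_reward: float,
) -> float:
    """Self-critical loss for one sequence (assumed standard form).

    The greedy decode's reward acts as the baseline: the loss is
    -(r_sampled - r_greedy) * log p(sampled sequence).
    Minimizing it raises the likelihood of samples that beat the baseline
    and lowers the likelihood of samples that fall short of it.
    """
    advantage = sampled_reward - greedy_reward
    return -advantage * sequence_log_prob(sampled_token_probs)


if __name__ == "__main__":
    # Hypothetical numbers: in an NLI-guided setup, the rewards would come
    # from an NLI model scoring the generated text. Here they are placeholders.
    loss = scst_loss(
        sampled_token_probs=[0.5, 0.5],
        sampled_reward=0.9,
        greedy_reward=0.4,
    )
    print(f"SCST loss: {loss:.4f}")
```

Using the greedy output as the baseline means no separate value network is needed. Only the scalar reward function would have to change to plug in an actual NLI scorer.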

Code & Models

Repositories

khuangaf/fakingfakenews · tf · Official

Taxonomy

Topics: Misinformation and Its Impacts · Hate Speech and Cyberbullying Detection · Advanced Malware Detection Techniques