Faking Fake News for Real Fake News Detection: Propaganda-loaded Training Data Generation
Kung-Hsiang Huang, Kathleen McKeown, Preslav Nakov, Yejin Choi, and Heng Ji

TL;DR
This paper introduces a novel data generation framework that creates propaganda-informed training data to improve the detection of human-written fake news, bridging the gap between machine-generated and human disinformation.
Contribution
The authors propose a new framework for generating propaganda-loaded training data, resulting in the PropaNews dataset, which enhances fake news detection accuracy on real human-authored disinformation.
Findings
Fake news detectors trained on PropaNews are better at detecting human-written disinformation, with F1 score gains of 3.62-7.69%.
The framework incorporates propaganda techniques like appeal to authority and loaded language.
The PropaNews dataset contains 2,256 examples for future research.
Abstract
Despite recent advances in detecting fake news generated by neural models, their results are not readily applicable to effective detection of human-written disinformation. What limits the successful transfer between them is the sizable gap between machine-generated fake news and human-authored ones, including the notable differences in terms of style and underlying intent. With this in mind, we propose a novel framework for generating training examples that are informed by the known styles and strategies of human-authored propaganda. Specifically, we perform self-critical sequence training guided by natural language inference to ensure the validity of the generated articles, while also incorporating propaganda techniques, such as appeal to authority and loaded language. In particular, we create a new training dataset, PropaNews, with 2,256 examples, which we release for future use. Our…
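The abstract names self-critical sequence training (SCST) guided by natural language inference but does not spell out the objective. The sketch below shows the generic SCST loss for a single sampled sequence, using the reward of the greedy-decoded output as the baseline. It is an illustration under stated assumptions, not the authors' implementation: the function names, the `nli_reward` callable, and the toy reward values are all hypothetical, and the actual reward used in the paper is not given in the text above.

```python
from typing import Callable, Sequence


def scst_loss(
    sample_logprobs: Sequence[float],
    sample_reward: float,
    baseline_reward: float,
) -> float:
    """Generic self-critical loss for one sampled sequence.

    loss = -(r_sample - r_greedy) * sum(log p(token)) over the sampled tokens.
    The greedy output's reward acts as the baseline.
    """
    advantage = sample_reward - baseline_reward
    return -advantage * sum(sample_logprobs)


def nli_guided_loss(
    context: str,
    sampled_text: str,
    greedy_text: str,
    sample_logprobs: Sequence[float],
    nli_reward: Callable[[str, str], float],
) -> float:
    """Score both outputs with a (hypothetical) NLI-based reward, then apply SCST.

    `nli_reward(context, text)` is an assumed placeholder that returns a scalar
    score from an NLI model; how that score is defined is not specified here.
    """
    r_sample = nli_reward(context, sampled_text)
    r_greedy = nli_reward(context, greedy_text)
    return scst_loss(sample_logprobs, r_sample, r_greedy)


if __name__ == "__main__":
    # Toy stand-in for an NLI scorer, for illustration only.
    def toy_reward(context: str, text: str) -> float:
        return 0.9 if text == "sampled sentence" else 0.4

    loss = nli_guided_loss(
        "original article",
        "sampled sentence",
        "greedy sentence",
        [-0.5, -1.0],
        toy_reward,
    )
    print(loss)
```

When the sampled output scores higher than the greedy one, minimizing this loss raises the log-probability of the sampled tokens; when it scores lower, the sample is pushed down. If the two rewards are equal, the loss is zero and the sample contributes no gradient.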
Taxonomy
Topics: Misinformation and Its Impacts · Hate Speech and Cyberbullying Detection · Advanced Malware Detection Techniques
