Defending Against Neural Fake News
Rowan Zellers, Ari Holtzman, Hannah Rashkin, Yonatan Bisk, Ali Farhadi, Franziska Roesner, Yejin Choi

TL;DR
This paper introduces Grover, a controllable text generation model for neural fake news, evaluates detection methods, and discusses ethical implications, emphasizing the importance of releasing strong generators for improved detection.
Contribution
The paper presents Grover, a new model for generating neural fake news, and analyzes detection techniques, highlighting the effectiveness of using the generator itself as a defense.
Findings
The best existing discriminators distinguish neural fake news from real, human-written news with 73% accuracy, given a moderate amount of training data.
Human raters find Grover-generated articles more trustworthy than human-written disinformation.
Using Grover itself as the detector yields 92% accuracy, which supports publicly releasing strong generators.
Abstract
Recent progress in natural language generation has raised dual-use concerns. While applications like summarization and translation are positive, the underlying technology also might enable adversaries to generate neural fake news: targeted propaganda that closely mimics the style of real news. Modern computer security relies on careful threat modeling: identifying potential threats and vulnerabilities from an adversary's point of view, and exploring potential mitigations to these threats. Likewise, developing robust defenses against neural fake news requires us first to carefully investigate and characterize the risks of these models. We thus present a model for controllable text generation called Grover. Given a headline like 'Link Found Between Vaccines and Autism,' Grover can generate the rest of the article; humans find these generations to be more trustworthy than human-written…
Taxonomy
Topics: Misinformation and Its Impacts · Hate Speech and Cyberbullying Detection · Topic Modeling
