A Stylometric Inquiry into Hyperpartisan and Fake News
Martin Potthast, Johannes Kiesel, Kevin Reinartz, Janek Bevendorff, Benno Stein

TL;DR
This study analyzes the writing style of hyperpartisan and fake news using a large corpus that was manually fact-checked by journalists. It finds that left-wing and right-wing hyperpartisan news share a similar style, and that style can separate hyperpartisan from mainstream news but is of limited use for detecting fake news.
Contribution
It proposes a new way of assessing style similarity between text categories via Unmasking, a meta-learning approach originally devised for authorship verification, and applies it to compare hyperpartisan, fake, and mainstream news.
Findings
The writing styles of left-wing and right-wing hyperpartisan news are more similar to each other than either is to mainstream news.
Style-based detection can distinguish hyperpartisan news from mainstream with high accuracy.
Fake news detection based solely on style has limited effectiveness.
Abstract
This paper reports on a writing style analysis of hyperpartisan (i.e., extremely one-sided) news in connection to fake news. It presents a large corpus of 1,627 articles that were manually fact-checked by professional journalists from BuzzFeed. The articles originated from 9 well-known political publishers, 3 each from the mainstream, the hyperpartisan left-wing, and the hyperpartisan right-wing. In sum, the corpus contains 299 fake news articles, 97% of which originated from hyperpartisan publishers. We propose and demonstrate a new way of assessing style similarity between text categories via Unmasking, a meta-learning approach originally devised for authorship verification, revealing that the styles of left-wing and right-wing news have a lot more in common with each other than either has with the mainstream. Furthermore, we show that hyperpartisan news can be discriminated well by its style…
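To make the Unmasking idea more concrete, the following is a minimal sketch of an unmasking loop, not the authors' implementation. It repeatedly measures how well two sets of feature vectors can be told apart, removes the most discriminative features, and records the resulting accuracy curve. Everything in it is an assumption made for illustration: the synthetic feature vectors, the function names, the parameters `rounds`, `drop`, and `folds`, and in particular the classifier, which is a simple mean-difference linear model swapped in so the example needs only the standard library (Unmasking is usually described with a linear SVM).

```python
import random
from statistics import mean


def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [mean(col) for col in zip(*rows)]


def fit_weights(a_rows, b_rows, active):
    """Mean-difference linear model restricted to the active feature indices.

    This is a stand-in classifier chosen for simplicity (an assumption),
    not the classifier used in the paper.
    """
    ca, cb = centroid(a_rows), centroid(b_rows)
    weights = {j: ca[j] - cb[j] for j in active}
    # Place the decision boundary at the midpoint between the two centroids.
    bias = -sum(weights[j] * (ca[j] + cb[j]) / 2 for j in active)
    return weights, bias


def predict_is_a(row, weights, bias):
    """True if the row falls on category A's side of the boundary."""
    return sum(w * row[j] for j, w in weights.items()) + bias > 0


def cv_accuracy(a_rows, b_rows, active, folds=5):
    """k-fold cross-validated accuracy; fold f holds out rows with i % folds == f."""
    correct = total = 0
    for f in range(folds):
        a_train = [r for i, r in enumerate(a_rows) if i % folds != f]
        b_train = [r for i, r in enumerate(b_rows) if i % folds != f]
        weights, bias = fit_weights(a_train, b_train, active)
        for i, r in enumerate(a_rows):
            if i % folds == f:
                correct += int(predict_is_a(r, weights, bias))
                total += 1
        for i, r in enumerate(b_rows):
            if i % folds == f:
                correct += int(not predict_is_a(r, weights, bias))
                total += 1
    return correct / total


def unmasking_curve(a_rows, b_rows, rounds=3, drop=2):
    """Accuracy per round while the strongest features are removed each round."""
    active = set(range(len(a_rows[0])))
    curve = []
    for _ in range(rounds):
        if not active:
            break
        curve.append(cv_accuracy(a_rows, b_rows, active))
        weights, _ = fit_weights(a_rows, b_rows, active)
        ranked = sorted(active, key=lambda j: abs(weights[j]), reverse=True)
        active -= set(ranked[:drop])
    return curve


def make_rows(n, means, rng):
    """Synthetic feature vectors: one unit-variance Gaussian per feature."""
    return [[rng.gauss(m, 1.0) for m in means] for _ in range(n)]


rng = random.Random(0)
baseline = make_rows(40, [0.0] * 10, rng)
# Differs from the baseline in only two features (a superficial difference).
shallow = make_rows(40, [3.0, 3.0] + [0.0] * 8, rng)
# Differs from the baseline in every feature (a deep difference).
deep = make_rows(40, [3.0] * 10, rng)

shallow_curve = unmasking_curve(shallow, baseline)
deep_curve = unmasking_curve(deep, baseline)
print("shallow difference:", shallow_curve)
print("deep difference:   ", deep_curve)
```

In this toy setup, a curve that collapses quickly suggests the two categories differ only in a few superficial features, while a curve that stays high suggests the difference runs through many features. With real articles, the rows would instead be style feature vectors extracted from chunks of text in each category.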
Taxonomy
Topics: Authorship Attribution and Profiling · Misinformation and Its Impacts · Spam and Phishing Detection
