Identifying Misinformation from Website Screenshots
Sara Abdali, Rutuja Gurav, Siddharth Menon, Daniel Fonseca, Negin Entezari, Neil Shah, Evangelos E. Papalexakis

TL;DR
This paper introduces VizFake, a semi-supervised method that applies tensor decomposition to website screenshots to detect misinformation. It reaches roughly 85% F1 with only 5% of the data labeled, and it is faster and simpler than deep transfer learning methods.
Contribution
The paper presents VizFake, a novel approach leveraging tensor decomposition on website screenshots for misinformation detection, effective with scarce labels and computationally efficient.
Findings
Achieves roughly 85% F1 score with only 5% labeled data.
Insensitive to image transformations like grayscale conversion.
Faster and simpler than deep transfer learning methods.
Abstract
Can the look and feel of a website give information about the trustworthiness of an article? In this paper, we propose to use a promising yet neglected aspect in detecting misinformation: the overall look of the domain webpage. To capture this overall look, we take screenshots of news articles served by either misinformative or trustworthy web domains and leverage a tensor-decomposition-based semi-supervised classification technique. The proposed approach, VizFake, is insensitive to a number of image transformations, such as converting the image to grayscale, vectorizing the image, and losing some parts of the screenshots. VizFake leverages a very small amount of known labels, mirroring realistic and practical scenarios where labels (especially for known misinformative articles) are scarce and quickly become dated. The F1 score of VizFake on a dataset of 50k screenshots…
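The pipeline the abstract describes (stack screenshots into a tensor, factorize it, then classify with a handful of labels) can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' implementation: the tensor layout `(articles, height, width, channels)`, the synthetic "screenshots", the use of a truncated SVD of the article-mode unfolding as a simple stand-in for the tensor decomposition, and the k-nearest-neighbour label propagation used for the semi-supervised step. The function names are hypothetical.

```python
import numpy as np


def make_synthetic(n_per_class=20, seed=0):
    """Toy 8x8 RGB 'screenshots': class 0 has a header band, class 1 a sidebar."""
    rng = np.random.default_rng(seed)
    shots = rng.normal(0.0, 0.05, size=(2 * n_per_class, 8, 8, 3))
    shots[:n_per_class, :2, :, :] += 1.0   # class 0 (trustworthy): bright top rows
    shots[n_per_class:, :, :2, :] += 1.0   # class 1 (misinformative): bright left columns
    labels = np.array([0] * n_per_class + [1] * n_per_class)
    return shots, labels


def article_embeddings(tensor, rank):
    """Low-rank embedding per article from the article-mode unfolding.

    Assumption: truncated SVD stands in for the tensor decomposition.
    """
    n = tensor.shape[0]
    unfolded = tensor.reshape(n, -1).astype(float)  # astype returns a copy
    unfolded -= unfolded.mean(axis=0)
    u, s, _ = np.linalg.svd(unfolded, full_matrices=False)
    return u[:, :rank] * s[:rank]


def propagate_labels(emb, known, k=10, n_iter=50):
    """Spread a few known labels over a symmetric k-NN graph of the embeddings.

    known maps article index -> label (0 trustworthy, 1 misinformative).
    """
    n = emb.shape[0]
    dists = np.linalg.norm(emb[:, None, :] - emb[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)                 # no self-neighbours
    neighbours = np.argsort(dists, axis=1)[:, :k]

    adj = np.zeros((n, n))
    adj[np.repeat(np.arange(n), k), neighbours.ravel()] = 1.0
    adj = np.maximum(adj, adj.T)                    # make the graph undirected
    transition = adj / adj.sum(axis=1, keepdims=True)

    idx = np.array(list(known))
    vals = np.array([1.0 if known[i] == 1 else -1.0 for i in idx])
    scores = np.zeros(n)
    scores[idx] = vals
    for _ in range(n_iter):
        scores = transition @ scores                # average neighbour scores
        scores[idx] = vals                          # clamp the known labels
    return (scores > 0).astype(int)


shots, labels = make_synthetic()
emb = article_embeddings(shots, rank=3)
pred = propagate_labels(emb, known={0: 0, 20: 1})   # 2 of 40 articles labeled (5%)
accuracy = (pred == labels).mean()
```

On this toy data the two layouts are trivially separable, so the sketch only shows the mechanics; it says nothing about performance on real screenshots.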
Taxonomy
Topics: Misinformation and Its Impacts · Topic Modeling · Spam and Phishing Detection
