Health Misinformation Detection in Web Content: A Structural-, Content-based, and Context-aware Approach based on Web2Vec
Rishabh Upadhyay, Gabriella Pasi, and Marco Viviani

TL;DR
This paper presents a deep learning approach based on Web2Vec that combines structural, content, and context features to detect health misinformation on web pages, addressing a critical need for credible online health information.
Contribution
It proposes a novel integration of structural, content, and context features with Web2Vec embeddings for improved health misinformation detection on web pages.
Findings
Web2Vec effectively captures web page features for misinformation detection.
The approach outperforms traditional handcrafted feature methods.
Structural, content, and context features enhance detection accuracy.
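To make the three feature families concrete, here is a toy sketch of how structural, content, and context signals could be pulled from a single page and joined into one vector. It is not the paper's method: the approach described above learns embeddings with a deep network, whereas this example only counts things. The class name PageFeatureParser, the function extract_features, and the six chosen counts are all hypothetical, picked purely for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class PageFeatureParser(HTMLParser):
    """Hypothetical helper: collects tag names, visible words, and link hosts."""

    def __init__(self):
        super().__init__()
        self.tags = []        # structural signal: sequence of opening tags
        self.words = []       # content signal: lowercased visible tokens
        self.link_hosts = []  # context signal: hosts the page links out to

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)
        if tag == "a":
            href = dict(attrs).get("href") or ""
            host = urlparse(href).netloc
            if host:
                self.link_hosts.append(host)

    def handle_data(self, data):
        self.words.extend(data.lower().split())


def extract_features(html):
    """Return [structural..., content..., context...] as simple counts.

    Assumption: total and distinct counts stand in for the learned
    embeddings a real model would use.
    """
    parser = PageFeatureParser()
    parser.feed(html)
    parser.close()
    structural = [len(parser.tags), len(set(parser.tags))]
    content = [len(parser.words), len(set(parser.words))]
    context = [len(parser.link_hosts), len(set(parser.link_hosts))]
    return structural + content + context


page = (
    "<html><body><p>Vitamin C cures colds</p>"
    '<a href="https://example.org/x">source</a></body></html>'
)
features = extract_features(page)
```

The resulting vector could be fed to any downstream classifier. Keeping the three groups in a fixed order makes it easy to ablate one family at a time and see how much each contributes.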
Abstract
In recent years, we have witnessed the proliferation of large amounts of online content generated directly by users with virtually no form of external control, leading to the possible spread of misinformation. The search for effective solutions to this problem is still ongoing, and covers different areas of application, from opinion spam to fake news detection. A more recently investigated scenario, despite the serious risks that incurring disinformation could entail, is that of the online dissemination of health information. Early approaches in this area focused primarily on user-based studies applied to Web page content. More recently, automated approaches have been developed for both Web pages and social media content, particularly with the advent of the COVID-19 pandemic. These approaches are primarily based on handcrafted features extracted from online content in association with…
