No comments: Addressing commentary sections in websites' analyses
Florian Cafiero, Paul Guille-Escuret, Jeremy Ward

TL;DR
This paper investigates the impact of commentary sections on website analysis, demonstrating their potential to bias results and proposing guidelines for their removal or extraction, especially in controversial content like anti-vaccine sites.
Contribution
It introduces a systematic approach to identify and handle commentary sections in website analysis, highlighting their influence on bias and providing practical guidelines.
Findings
Commentary sections can bias website content analysis.
Analyzing commentary sections is valuable for understanding controversy.
Guidelines for removing or extracting commentary sections are proposed.
Abstract
Removing or extracting the commentary sections from a series of websites is a tedious task, as no standard way to code them is widely adopted. This operation is thus very rarely performed. In this paper, we show that these commentary sections can induce significant biases in the analyses, especially in the case of controversial [...]
Highlights
Commentary sections can induce biases in the analysis of websites' contents.
Analyzing these sections can be interesting per se.
We illustrate these points using a corpus of anti-vaccine websites.
We provide guidelines to remove or extract these sections.
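To make the idea of removing or extracting commentary sections concrete, here is a minimal sketch in Python. It is not the authors' procedure: since no standard markup for comments exists, the matching rule below (an `id` or `class` containing "comment", "disqus", or "respond") is a hypothetical heuristic, and the names `CommentSplitter` and `split_page` are invented for illustration.

```python
from html.parser import HTMLParser
import re

# Hypothetical heuristic (an assumption, not from the paper): ids/classes
# that often mark comment containers. Real sites vary widely.
COMMENT_HINT = re.compile(r"comment|disqus|respond", re.IGNORECASE)

# Void elements never get a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class CommentSplitter(HTMLParser):
    """Route page text into `main` or `comments` based on the enclosing element."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.main = []
        self.comments = []
        self._depth = 0  # > 0 while inside a suspected comment container

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._depth > 0:
            self._depth += 1  # nested element inside the comment container
            return
        attr_map = dict(attrs)
        label = f"{attr_map.get('id') or ''} {attr_map.get('class') or ''}"
        if COMMENT_HINT.search(label):
            self._depth = 1  # entering a suspected comment container

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self._depth > 0:
            self._depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text:
            (self.comments if self._depth > 0 else self.main).append(text)


def split_page(html):
    """Return (main_text, comment_text) for one HTML page."""
    parser = CommentSplitter()
    parser.feed(html)
    parser.close()
    return " ".join(parser.main), " ".join(parser.comments)
```

Calling `split_page(page_html)` returns the page text with the suspected comments removed and, separately, the extracted comment text, so both can be analyzed on their own. The depth counter assumes well-formed markup; pages with unclosed tags or script-injected comment widgets would need a more tolerant parser and site-specific rules.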
Taxonomy
Topics: Misinformation and Its Impacts · Hate Speech and Cyberbullying Detection · Topic Modeling
