Interrupted time series analysis of clickbait on worldwide news websites, 2016-2023
Austin McCutcheon, Chris Brogly

TL;DR
This study analyzes the prevalence and variation of clickbait on worldwide news websites from 2016 to 2023, examining how major news events such as COVID-19 and the 2020 US Election appear to have affected clickbait levels.
Contribution
It introduces a large-scale dataset of clickbait scores and applies segmented regression models to analyze temporal changes related to major news events.
Findings
Average clickbait levels shifted around some of the 5 major news events studied.
COVID-19 and the 2020 US Election appeared to influence clickbait prevalence.
The dataset enables large-scale analysis of clickbait trends over time.
Abstract
Clickbait is deceptive text that can manipulate web browsing, creating an information gap between a link and target page that literally baits a user into clicking. Clickbait detection continues to be well studied, but analyses of clickbait overall on the web are limited. A dataset was built consisting of 451,033,388 clickbait scores produced by a clickbait detector which analyzed links and headings on primarily English news pages from the Common Crawl. On this data, 5 segmented regression models were fit on 5 major news events and averaged clickbait scores. COVID and the 2020 US Election appeared to influence clickbait levels.
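The segmented regression used in interrupted time series analysis can be illustrated with a minimal sketch. This is not the authors' code or data: the function name `fit_segmented_regression`, the monthly granularity, the interruption index, and the synthetic scores below are all assumptions made for illustration. The sketch fits the common four-term model, score = b0 + b1*t + b2*post + b3*(t - t0)*post, by ordinary least squares, where b2 estimates the immediate level change at the event and b3 the change in trend afterward.

```python
import numpy as np

def fit_segmented_regression(y, interruption_index):
    """Fit a single-interruption segmented regression by ordinary least squares.

    Hypothetical helper for illustration; not taken from the paper.
    """
    y = np.asarray(y, dtype=float)
    t = np.arange(len(y), dtype=float)
    # Indicator that is 1 from the interruption onward, 0 before it.
    post = (t >= interruption_index).astype(float)
    # Time elapsed since the interruption (0 before it).
    time_since = (t - interruption_index) * post
    X = np.column_stack([np.ones_like(t), t, post, time_since])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    names = ["baseline_level", "baseline_trend", "level_change", "trend_change"]
    return dict(zip(names, coef))

# Synthetic example (assumed values, not real clickbait scores):
# 48 monthly averages with an event at month 24.
rng = np.random.default_rng(0)
t = np.arange(48)
post = t >= 24
scores = (0.30 + 0.001 * t + 0.05 * post + 0.002 * (t - 24) * post
          + rng.normal(0, 0.005, size=48))

result = fit_segmented_regression(scores, interruption_index=24)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

A full analysis would normally also check for autocorrelation and seasonality in the residuals, which plain least squares does not account for.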
Taxonomy
Topics: Misinformation and Its Impacts · Communication and COVID-19 Impact · Social Media in Health Education
