TL;DR
This paper presents a language-agnostic approach that uses Personalized PageRank and CycleRank to select relevant Wikipedia pages, then feeds their page views to machine learning models to estimate influenza-like illness incidence in four European countries.
Contribution
It introduces a new automatic method for selecting relevant Wikipedia pages for disease monitoring, improving estimation accuracy without expert input.
Findings
Achieved state-of-the-art influenza prevalence estimation results
Demonstrated effectiveness across four European countries
Validated the approach's language independence and automation
Abstract
Influenza is an acute respiratory seasonal disease that affects millions of people worldwide and causes thousands of deaths in Europe alone. Being able to estimate the impact of an illness on a given country quickly and reliably is essential to plan and organize effective countermeasures, and this is now possible by leveraging unconventional data sources like web searches and visits. In this study, we show the feasibility of combining Wikipedia page views for a selected group of articles with machine learning models to obtain accurate estimates of influenza-like illness incidence in four European countries: Italy, Germany, Belgium, and the Netherlands. We propose a novel language-agnostic method, based on two algorithms, Personalized PageRank and CycleRank, to automatically select the most relevant Wikipedia pages to be monitored without the need for expert…
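The abstract names Personalized PageRank as one of the two page-selection algorithms but, as excerpted here, does not describe how it is applied. The sketch below is a minimal illustration of the standard algorithm, not the authors' implementation: it ranks pages in a small link graph by their proximity to a seed article and keeps the top few as candidates for monitoring. The link graph, the page titles, the damping factor `alpha`, and the helper names `personalized_pagerank` and `select_pages` are all assumptions made for illustration; CycleRank is not shown.

```python
def personalized_pagerank(links, seed, alpha=0.85, iterations=100):
    """Power-iteration Personalized PageRank with all teleport mass on `seed`.

    `links` maps a page title to the list of pages it links to.
    """
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)

    # Start with all probability mass on the seed page.
    scores = {node: 0.0 for node in nodes}
    scores[seed] = 1.0

    for _ in range(iterations):
        nxt = {node: 0.0 for node in nodes}
        for node in nodes:
            out = links.get(node, [])
            if out:
                # Follow an outgoing link with probability alpha.
                share = alpha * scores[node] / len(out)
                for target in out:
                    nxt[target] += share
            else:
                # Pages without outgoing links return their mass to the seed.
                nxt[seed] += alpha * scores[node]
        # Teleport back to the seed with probability 1 - alpha.
        # Total mass is 1, so the teleported mass is exactly 1 - alpha.
        nxt[seed] += 1.0 - alpha
        scores = nxt
    return scores


def select_pages(links, seed, k):
    """Return the k highest-scoring pages other than the seed itself."""
    scores = personalized_pagerank(links, seed)
    ranked = sorted((p for p in scores if p != seed),
                    key=scores.get, reverse=True)
    return ranked[:k]


# Hypothetical toy link graph; real input would be a Wikipedia link dump.
LINKS = {
    "Influenza": ["Fever", "Influenza vaccine", "Cough"],
    "Fever": ["Influenza", "Cough"],
    "Cough": ["Influenza"],
    "Influenza vaccine": ["Influenza"],
    "Football": ["Fever"],
}

if __name__ == "__main__":
    print(select_pages(LINKS, "Influenza", 3))
```

Because the random walk always restarts from the seed, a page such as "Football" that cannot be reached from the seed receives a score of zero, which is the property that makes the ranking usable as a relevance filter. The same procedure depends only on link structure, so it could in principle be run unchanged on any language edition of Wikipedia.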
