TL;DR
This study demonstrates that Twitter conversations, analyzed through a sentiment-involved topic-based approach, can significantly improve COVID-19 case forecasting accuracy, providing valuable early warning signals during pandemic waves.
Contribution
Introduces a novel sentiment-involved topic-based latent variable methodology for incorporating Twitter data into COVID-19 forecasting models, validated with Australian data.
Findings
Latent social media variables Granger-cause COVID-19 cases.
Inclusion of social media variables improves forecast accuracy by ~50%.
Provides a large-scale geotagged COVID-19 Twitter dataset, MegaGeoCOV.
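As a rough illustration of the Granger-causality finding above, the sketch below checks whether lagged values of one series help predict another beyond that series' own lags. It is not the paper's pipeline. The data are synthetic, and the names `lagged`, `granger_f_stat`, `signal`, and `cases` are hypothetical, introduced only for this example. It assumes a simple two-lag linear model and reports only the F-statistic, where a larger value means stronger evidence.

```python
import numpy as np


def lagged(series, lags):
    """Return a matrix whose k-th column is the series shifted back by k steps,
    aligned with series[lags:]."""
    n = len(series)
    return np.column_stack([series[lags - k: n - k] for k in range(1, lags + 1)])


def granger_f_stat(y, x, lags=2):
    """F-statistic for 'x Granger-causes y': compares a model of y on its own
    lags (restricted) with one that also includes lags of x (unrestricted)."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    target = y[lags:]
    ones = np.ones((len(target), 1))
    restricted = np.hstack([ones, lagged(y, lags)])
    unrestricted = np.hstack([restricted, lagged(x, lags)])

    def rss(design):
        # Ordinary least squares fit, then residual sum of squares.
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        return float(resid @ resid)

    rss_r = rss(restricted)
    rss_u = rss(unrestricted)
    dof = len(target) - unrestricted.shape[1]
    return ((rss_r - rss_u) / lags) / (rss_u / dof)


# Synthetic example (assumption): 'signal' stands in for a social media
# variable and leads 'cases' by one step.
rng = np.random.default_rng(0)
n = 300
signal = rng.normal(size=n)
cases = np.zeros(n)
for t in range(1, n):
    cases[t] = 0.5 * cases[t - 1] + 0.8 * signal[t - 1] + 0.1 * rng.normal()

f_x_to_y = granger_f_stat(cases, signal, lags=2)  # does signal help predict cases?
f_y_to_x = granger_f_stat(signal, cases, lags=2)  # does cases help predict signal?
print(f_x_to_y, f_y_to_x)
```

On this synthetic data the first statistic should be far larger than the second, because the dependence was built in one direction only. In practice a library routine that also reports p-values would normally be used instead of this hand-rolled version.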
Abstract
At the time of writing, COVID-19 (coronavirus disease 2019) has spread to more than 220 countries and territories. Following the outbreak, the seriousness of the pandemic has made people more active on social media, especially on microblogging platforms such as Twitter and Weibo. Pandemic-specific discourse has been trending on these platforms for months. Previous studies have confirmed that such socially generated conversations contribute to situational awareness of crisis events. Early forecasts of cases are essential for authorities to estimate the resources needed to cope with the outgrowths of the virus. Therefore, this study attempts to incorporate public discourse into the design of forecasting models targeted specifically at the steep-hill region of an ongoing wave. We propose a sentiment-involved topic-based latent variables search…
