Variance of Twitter Embeddings and Temporal Trends of COVID-19 cases
Mayank Sethi, Ambika Sadhu, Khushbu Pahwa, Sargun Nagpal, Tavpritesh Sethi

TL;DR
This paper explores how Twitter data and word embeddings can be used to predict COVID-19 case surges with a lead time of up to 30 days, aiding timely resource planning.
Contribution
It introduces a novel method leveraging Twitter embeddings and Significant Dimensions to forecast COVID-19 case increases with high accuracy.
Findings
Predicts COVID-19 case surges with 15- and 30-day lead times
Achieves R² scores of 0.80 and 0.62 for predictions
Identifies thematic significance of embedding dimensions
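To make "R² at a given lead time" concrete, the snippet below is a minimal sketch of how such a score could be computed. It is not the paper's pipeline: the function name `lagged_r2`, the single scalar signal, the simple linear fit, and the synthetic series are all assumptions made for illustration, and the score is measured in-sample.

```python
import numpy as np


def lagged_r2(signal, cases, lead_days):
    """In-sample R^2 of a linear fit predicting cases[t + lead_days] from signal[t].

    Hypothetical helper: `signal` stands in for any daily tweet-derived
    feature, `cases` for daily case counts over the same dates.
    """
    signal = np.asarray(signal, dtype=float)
    cases = np.asarray(cases, dtype=float)
    if signal.shape != cases.shape:
        raise ValueError("signal and cases must cover the same days")
    if not 0 < lead_days < len(signal):
        raise ValueError("lead_days must be between 1 and len(signal) - 1")

    # Pair today's signal with the case count lead_days later.
    x = signal[:-lead_days]
    y = cases[lead_days:]

    # Ordinary least-squares line through the (x, y) pairs.
    slope, intercept = np.polyfit(x, y, 1)
    predicted = slope * x + intercept

    # R^2 = 1 - (residual sum of squares / total sum of squares).
    ss_res = np.sum((y - predicted) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# Synthetic data (assumed, not from the paper): cases follow the signal
# 15 days later, plus a little noise.
rng = np.random.default_rng(0)
days, lead = 200, 15
signal = rng.normal(size=days).cumsum()
cases = np.full(days, 10.0)
cases[lead:] = 3.0 * signal[:-lead] + 10.0 + rng.normal(scale=0.1, size=days - lead)

print(f"R^2 at a {lead}-day lead: {lagged_r2(signal, cases, lead):.3f}")
```

A score near 1 means the signal from `lead_days` earlier explains most of the variation in case counts. A real evaluation would score held-out dates rather than the data used for fitting.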
Abstract
The severity of the coronavirus pandemic calls for effective administrative decisions. Over 4 lakh people in India have succumbed to COVID-19, with over 3 crore confirmed cases and counting. The threat of a plausible third wave continues to haunt millions. Given the ever-changing dynamics of the virus, predictive modeling methods can serve as an integral tool. The pandemic has also triggered unprecedented usage of social media. This paper proposes a method for harnessing social media, specifically Twitter, to predict upcoming scenarios related to COVID-19 cases. In this study, we seek to understand how surges in COVID-19-related tweets can indicate a rise in cases. This prospective analysis can help administrators allocate resources in time to lessen the severity of the damage. Using word embeddings to capture the semantic meaning…
Taxonomy
Topics: Artificial Intelligence in Law · Sentiment Analysis and Opinion Mining
