Scalable Analysis for Covid-19 and Vaccine Data

Chris Collins; Roxana Cuevas; Edward Hernandez; Reece Hernandez; Breanna Le; Jongwook Woo

arXiv:2108.02898·cs.DC·August 9, 2021

Open Access

TL;DR

This paper demonstrates scalable methods using Big Data tools like Hadoop and Hive to analyze large Covid-19 and vaccine datasets, revealing correlations between vaccination rates and case reductions.

Contribution

It introduces scalable Big Data analysis techniques for Covid-19 data, enabling efficient processing and visualization of large-scale health data.

Findings

01

Higher vaccination rates correlate with fewer confirmed Covid-19 cases.

02

Big Data tools can effectively handle 3.2 GB Covid-19 datasets.

03

Visualizations aid in understanding the impact of vaccination.

Abstract

This paper explains the scalable methods used to extract and analyze Covid-19 vaccine data. Using Big Data tools such as Hadoop and Hive, we collect and analyze a massive data set of Covid-19 confirmed cases, fatalities, and vaccinations. The data size is about 3.2 GB. We show that it is possible to store and process massive data with Big Data tools. The paper then performs a tempo-spatial analysis, and maps, charts, and pie charts visualize the results of the investigation. We illustrate that the more people are vaccinated, the fewer the confirmed cases.

Taxonomy

Topics: Data Visualization and Analytics · Anomaly Detection Techniques and Applications · Data Stream Mining Techniques