Collaborative Cloud Computing Framework for Health Data with Open Source Technologies

Fatemeh Rouzbeh; Ananth Grama; Paul Griffin; Mohammad Adibuzzaman

arXiv:2007.10498 · cs.DC · July 28, 2020

TL;DR

This paper presents a new open-source cloud computing framework tailored for health data that addresses performance, flexibility, scalability, and privacy compliance challenges in scientific research.

Contribution

The paper introduces a novel architecture that uses open-source tools such as Hadoop, Kubernetes, and JupyterHub for health data analysis in a distributed environment.

Findings

01. System successfully processed 69 million patient records.

02. Framework improved data manipulation and query performance.

03. Ensured HIPAA compliance in a scalable cloud setup.

Abstract

The proliferation of sensor technologies and advancements in data collection methods have enabled the accumulation of very large amounts of data. Increasingly, these datasets are considered for scientific research. However, designing a system architecture that achieves high performance in parallelization, query processing time, and aggregation of heterogeneous data types (e.g., time series, images, and structured data), while also making scientific research easier to reproduce, remains a major challenge. This is especially true for health sciences research, where systems must be i) easy to use, with the flexibility to manipulate data at the most granular level, ii) agnostic of programming language kernel, iii) scalable, and iv) compliant with the HIPAA privacy law. In this paper, we review the existing literature for such big data systems for scientific research in health…
