Design and Implementation of a Scalable Clinical Data Warehouse for Resource-Constrained Healthcare Systems
Shovito Barua Soumma, Fahim Shahriar, Umme Niraj Mahi, Md Hasin Abrar, Md Abdur Rahman Fahad, Abu Sayed Md. Latiful Hoque

TL;DR
This paper presents a scalable, privacy-preserving clinical data warehouse designed for resource-limited healthcare systems, enabling effective integration, analysis, and outbreak prediction from heterogeneous electronic health records.
Contribution
It introduces a novel framework with automated data ingestion, patient identity resolution, and disease-specific data marts tailored for developing countries' healthcare infrastructure.
Findings
NoSQL outperforms relational databases by 40-69% in query processing.
The system can handle 19 million records per day, totaling 34 TB over 5 years.
The framework effectively supports infectious disease management and outbreak prediction.
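As a rough sanity check on the capacity figures above, the following sketch works out the average storage per record that they imply. It assumes ingestion every day of a 365-day year and decimal terabytes (10^12 bytes); neither assumption is stated in the source, and the per-record figure is a derived estimate rather than a number reported by the authors.

```python
# Back-of-envelope check of the stated capacity figures.
RECORDS_PER_DAY = 19_000_000   # stated in the findings
YEARS = 5                      # stated in the findings
TOTAL_BYTES = 34 * 10**12      # 34 TB; ASSUMPTION: decimal terabytes
DAYS_PER_YEAR = 365            # ASSUMPTION: daily ingestion, leap days ignored

# Total number of records accumulated over the full period.
total_records = RECORDS_PER_DAY * DAYS_PER_YEAR * YEARS

# Average storage per record implied by the two stated figures.
bytes_per_record = TOTAL_BYTES / total_records

print(f"total records over {YEARS} years: {total_records:,}")
print(f"implied average size per record: {bytes_per_record:.0f} bytes")
```

Under these assumptions the two figures are mutually consistent at roughly 1 KB per record, which is a plausible size for a structured clinical record plus indexing overhead.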
Abstract
Centralized electronic health record repositories are critical for advancing disease surveillance, public health research, and evidence-based policymaking. However, developing countries face persistent challenges in achieving this due to fragmented healthcare data sources, inconsistent record-keeping practices, and the absence of standardized patient identifiers, which limit reliable record linkage, compromise data interoperability, and restrict scalability. These obstacles are exacerbated by infrastructural constraints and privacy concerns. To address these barriers, this study proposes a scalable, privacy-preserving clinical data warehouse, NCDW, designed for heterogeneous EHR integration in resource-limited settings and tested with 1.16 million clinical records. The framework incorporates a wrapper-based data acquisition layer for secure, automated ingestion of multisource health data and introduces…
Taxonomy
Topics: Big Data and Business Intelligence · Artificial Intelligence in Healthcare · Service-Oriented Architecture and Web Services
