A Scalable and Robust Framework for Data Stream Ingestion
Haruna Isah, Farhana Zulkernine

TL;DR
This paper presents a scalable, fault-tolerant framework for ingesting and integrating high-velocity data streams from diverse sources, demonstrated through a real-world case study built on Apache NiFi and Apache Kafka.
Contribution
It introduces a novel, reusable data stream ingestion framework that enhances scalability and robustness, addressing key challenges in processing continuous, high-volume data streams.
Findings
Framework effectively handles diverse data sources
Demonstrated robustness in real-world case study
Identified best practices and future research gaps
Abstract
An essential part of building a data-driven organization is the ability to handle and process continuous streams of data to discover actionable insights. The explosive growth of interconnected devices and the social Web has led to a large volume of data being generated continuously. Streaming data sources such as stock quotes, credit card transactions, trending news, traffic conditions, and time-sensitive patient data are not only very common, but the data they produce can rapidly lose value if not processed quickly. The ever-increasing volume and highly irregular data rates pose new challenges to data stream processing systems. One such challenging but important task is accurately ingesting and integrating data streams from various sources and locations into an analytics platform. These challenges demand new strategies and systems that can offer the desired degree of scalability and…
