Scalable Fault-Tolerant Data Feeds in AsterixDB
Raman Grover, Michael J. Carey

TL;DR
This paper presents AsterixDB's integrated support for scalable, fault-tolerant data feeds, enabling continuous ingestion and indexing of high-velocity semi-structured data within a unified system.
Contribution
The paper introduces native data feed support in AsterixDB and details its design, implementation, and customization options for scalable, fault-tolerant data ingestion.
Findings
Demonstrates scalability of data feeds in AsterixDB
Shows fault-tolerance capabilities through initial experiments
Highlights customization options for resource allocation
Abstract
In this paper we describe the support for data feed ingestion in AsterixDB, an open-source Big Data Management System (BDMS) that provides a platform for storage and analysis of large volumes of semi-structured data. Data feeds are a mechanism for having continuous data arrive into a BDMS from external sources and incrementally populate a persisted dataset and associated indexes. The need to persist and index "fast-flowing" high-velocity data (and support ad hoc analytical queries) is ubiquitous. However, the state of the art today involves 'gluing' together different systems. AsterixDB is different in being a unified system with "native support" for data feed ingestion. We discuss the challenges and present the design and implementation of the concepts involved in modeling and managing data feeds in AsterixDB. AsterixDB allows the runtime behavior, allocation of resources and the…
Taxonomy
Topics: Cloud Computing and Resource Management · Distributed systems and fault tolerance · Advanced Data Storage Technologies
