The BigDAWG Polystore System and Architecture
Vijay Gadepally, Peinan Chen, Jennie Duggan, Aaron Elmore, Brandon Haynes, Jeremy Kepner, Samuel Madden, Tim Mattson, Michael Stonebraker

TL;DR
BigDAWG is a polystore system designed to manage complex, heterogeneous datasets across multiple database engines, improving performance and flexibility for datasets that mix several data types, such as medical data.
Contribution
The paper introduces the BigDAWG polystore architecture, enabling seamless integration of different database systems for complex, multi-model data management.
Findings
Prototype applied to medical dataset validates polystore concepts
Initial performance results show promising efficiency
Supports diverse data models with a unified interface
Abstract
Organizations are often faced with the challenge of providing data management solutions for large, heterogeneous datasets that may have different underlying data and programming models. For example, a medical dataset may have unstructured text, relational data, time series waveforms and imagery. Trying to fit such datasets in a single data management system can have adverse performance and efficiency effects. As a part of the Intel Science and Technology Center on Big Data, we are developing a polystore system designed for such problems. BigDAWG (short for the Big Data Analytics Working Group) is a polystore system designed to work on complex problems that naturally span across different processing or storage engines. BigDAWG provides an architecture that supports diverse database systems working with different data models, support for the competing notions of location transparency and…
