D4M: Bringing Associative Arrays to Database Engines
Vijay Gadepally, Jeremy Kepner, William Arcand, David Bestor, Bill Bergeron, Chansup Byun, Lauren Edwards, Matthew Hubbell, Peter Michaleas, Julie Mullen, Andrew Prout, Antonio Rosa, Charles Yee, Albert Reuther

TL;DR
D4M (Dynamic Distributed Dimensional Data Model) provides a single associative array interface across multiple database engines, such as SciDB and Accumulo, which simplifies data analysis workflows that span more than one backend.
Contribution
The paper extends D4M to support multiple database engines, including SciDB, and provides a federated interface for associative array operations across them.
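For readers unfamiliar with the term, an associative array can be thought of as a sparse matrix whose rows and columns are indexed by strings rather than integers. The following is a minimal Python sketch of that idea only; it is not the D4M API, and the class name `Assoc`, its methods, and the sample keys are all hypothetical choices made for illustration.

```python
from collections import defaultdict


class Assoc:
    """Toy associative array: a sparse matrix keyed by (row, col) strings.

    Hypothetical illustration only; this is not the D4M interface.
    """

    def __init__(self, triples=()):
        # Store only non-empty entries; duplicate keys are summed.
        self.data = {}
        for row, col, val in triples:
            self.data[(row, col)] = self.data.get((row, col), 0) + val

    def triples(self):
        return [(r, c, v) for (r, c), v in self.data.items()]

    def __add__(self, other):
        # Element-wise sum over the union of keys.
        return Assoc(self.triples() + other.triples())

    def rows(self, keys):
        # Keep only the entries whose row key is in `keys`.
        keys = set(keys)
        return Assoc(t for t in self.triples() if t[0] in keys)

    def transpose(self):
        return Assoc((c, r, v) for r, c, v in self.triples())

    def __matmul__(self, other):
        # Sparse matrix product: match our column keys to their row keys.
        by_row = defaultdict(list)
        for k, c, w in other.triples():
            by_row[k].append((c, w))
        out = []
        for r, k, v in self.triples():
            for c, w in by_row.get(k, ()):
                out.append((r, c, v * w))
        return Assoc(out)


# Rows are documents, columns are "field|value" strings (sample data).
A = Assoc([
    ("doc1", "word|data", 1),
    ("doc1", "word|graph", 1),
    ("doc2", "word|data", 1),
])

# Document-by-document matrix counting shared columns.
shared = A @ A.transpose()
print(shared.data[("doc1", "doc2")])
```

Because every operation returns another `Assoc`, operations compose, which is the property that lets one interface sit in front of different storage backends. In this sketch the data lives in a Python dictionary; a database-backed version would fetch the triples from a table instead.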
Findings
D4M now supports SciDB alongside Accumulo.
Performance benchmarks demonstrate efficient data operations.
Unified interface simplifies multi-database data analysis.
Abstract
The ability to collect and analyze large amounts of data is a growing problem within the scientific community. The growing gap between data and users calls for innovative tools that address the challenges faced by big data volume, velocity and variety. Numerous tools exist that allow users to store, query and index these massive quantities of data. Each storage or database engine comes with the promise of dealing with complex data. Scientists and engineers who wish to use these systems often quickly find that there is no single technology that offers a panacea to the complexity of information. When using multiple technologies, however, there is significant trouble in designing the movement of information between storage and database engines to support an end-to-end application along with a steep learning curve associated with learning the nuances of each underlying technology. In this…
