D4M 3.0: Extended Database and Language Capabilities
Lauren Milechin, Vijay Gadepally, Siddharth Samsi, Jeremy Kepner, Alexander Chen, Dylan Hutchison

TL;DR
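D4M 3.0 extends the D4M analytics toolbox with connectors to new database engines such as SciDB, graph analytics inside Apache Accumulo through Graphulo, and a Julia implementation. Benchmarks show fast SciDB ingest, graph algorithms at scales that would otherwise be memory limited, and Julia performance that matches or exceeds MATLAB.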
Contribution
The paper introduces D4M 3.0, which adds expanded database support (including SciDB), in-database graph analytics with D4M-Graphulo on Apache Accumulo, and a Julia version of the toolbox, and reports benchmarking and scaling results for each.
Findings
Fast SciDB ingest with D4M-SciDB connector
Graphulo enables large-scale graph algorithms
Julia implementation matches or exceeds MATLAB performance
Abstract
The D4M tool was developed to address many of today's data needs. This tool is used by hundreds of researchers to perform complex analytics on unstructured data. Over the past few years, the D4M toolbox has evolved to support connectivity with a variety of new database engines, including SciDB. D4M-Graphulo provides the ability to do graph analytics in the Apache Accumulo database. Finally, an implementation using the Julia programming language is also now available. In this article, we describe some of our latest additions to the D4M toolbox and our upcoming D4M 3.0 release. We show through benchmarking and scaling results that we can achieve fast SciDB ingest using the D4M-SciDB connector, that using Graphulo can enable graph algorithms on scales that can be memory limited, and that the Julia implementation of D4M achieves comparable performance or exceeds that of the existing…
