Hyperscaling Internet Graph Analysis with D4M on the MIT SuperCloud
Vijay Gadepally, Jeremy Kepner, Lauren Milechin, William Arcand, David Bestor, Bill Bergeron, Chansup Byun, Matthew Hubbell, Micheal Houle, Micheal Jones, Peter Michaleas, Julie Mullen, Andrew Prout, Antonio Rosa, Charles Yee, Siddharth Samsi, Albert Reuther

TL;DR
This paper demonstrates how D4M combined with MIT SuperCloud enables rapid, scalable analysis of massive network traffic data, significantly improving processing speed for anomaly detection and network analytics.
Contribution
It introduces a scalable, high-level programming approach using D4M on SuperCloud for efficient network traffic analytics at scale.
Findings
Achieved over 20,000x speedup in processing 96 hours of PCAP data.
Implemented a complete analytics pipeline in only 135 lines of code.
Enabled interactive analysis of large-scale network data within minutes.
Abstract
Detecting anomalous behavior in network traffic is a major challenge due to the volume and velocity of network traffic. For example, a 10 Gigabit Ethernet connection can generate over 50 MB/s of packet headers. For global network providers, this challenge can be amplified by many orders of magnitude. Development of novel computer network traffic analytics requires: high-level programming environments, massive amounts of packet capture (PCAP) data, and diverse data products for "at scale" algorithm pipeline development. D4M (Dynamic Distributed Dimensional Data Model) combines the power of sparse linear algebra, associative arrays, parallel processing, and distributed databases (such as SciDB and Apache Accumulo) to provide a scalable data and computation system that addresses the big data problems associated with network analytics development. Combining D4M with the MIT SuperCloud…
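To make the associative-array idea in the abstract concrete, here is a minimal Python sketch of a sparse source-by-destination traffic matrix keyed by IP strings. It is an illustration only and not the D4M API: the class ToyAssocArray, the function build_traffic_matrix, and the sample addresses (taken from the reserved documentation ranges) are all hypothetical names introduced for this example.

```python
from collections import defaultdict


class ToyAssocArray:
    """Toy sparse 2-D array with string row/column keys (NOT the D4M API)."""

    def __init__(self):
        # row key -> {column key: numeric value}; absent entries are implicit zeros
        self.data = defaultdict(dict)

    def add(self, row, col, value=1):
        """Accumulate value into entry (row, col)."""
        self.data[row][col] = self.data[row].get(col, 0) + value

    def row_sums(self):
        """Sum over columns: packets sent per source."""
        return {r: sum(cols.values()) for r, cols in self.data.items()}

    def col_sums(self):
        """Sum over rows: packets received per destination."""
        totals = defaultdict(int)
        for cols in self.data.values():
            for c, v in cols.items():
                totals[c] += v
        return dict(totals)

    def fan_out(self):
        """Number of distinct destinations contacted by each source."""
        return {r: len(cols) for r, cols in self.data.items()}


def build_traffic_matrix(packets):
    """Build a source x destination packet-count array from (src, dst) pairs."""
    A = ToyAssocArray()
    for src, dst in packets:
        A.add(src, dst)
    return A


# Hypothetical packet headers reduced to (source IP, destination IP)
packets = [
    ("192.0.2.1", "198.51.100.7"),
    ("192.0.2.1", "198.51.100.7"),
    ("192.0.2.1", "198.51.100.9"),
    ("192.0.2.2", "198.51.100.7"),
]

A = build_traffic_matrix(packets)
print(A.row_sums())  # packets sent per source
print(A.col_sums())  # packets received per destination
print(A.fan_out())   # distinct destinations per source
```

The row and column sums play the role that matrix-vector products play in a sparse linear algebra formulation; a source with unusually high fan-out is the kind of feature an anomaly-detection pipeline might flag.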
Taxonomy
Topics: Complex Network Analysis Techniques · Interconnection Networks and Systems · Network Packet Processing and Optimization
