Temporal Correlation of Internet Observatories and Outposts
Jeremy Kepner, Michael Jones, Daniel Andersen, Aydın Buluç, Chansup Byun, K Claffy, Timothy Davis, William Arcand, Jonathan Bernays, David Bestor, William Bergeron, Vijay Gadepally, Daniel Grant, Micheal Houle, Matthew Hubbell, Hayden Jananthan, Anna Klein

TL;DR
This paper correlates the sources of unsolicited Internet traffic observed at two vantage points (the CAIDA darknet telescope and the GreyNoise honeyfarm), revealing persistent high-frequency sources and their temporal dynamics over months, using GraphBLAS hyperspace matrices and D4M associative arrays.
Contribution
It introduces a novel analysis of Internet traffic correlations between observatories using GraphBLAS and D4M technologies, and characterizes the temporal behavior of high-frequency sources.
Findings
70% of bright sources are consistently detected across observatories within 6 months.
The distribution of sources follows a Zipf-Mandelbrot distribution.
Temporal correlations fit a modified Cauchy distribution, indicating drifting high-frequency sources.
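The two distributions named in the findings can be sketched numerically. The block below is a minimal illustration, assuming the commonly used forms p(d) ∝ 1/(d + δ)^α for Zipf-Mandelbrot and β/(β + |t − t0|^α) for the modified Cauchy curve; the exact parameterization in the paper may differ, and every parameter value here is a placeholder rather than a fitted result.

```python
def zipf_mandelbrot_pmf(d_max, alpha, delta):
    """Normalized Zipf-Mandelbrot probabilities for d = 1..d_max.

    Assumed form: p(d) proportional to 1 / (d + delta) ** alpha.
    """
    weights = [1.0 / (d + delta) ** alpha for d in range(1, d_max + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def modified_cauchy(t, t0, alpha, beta):
    """Assumed modified Cauchy form: beta / (beta + |t - t0| ** alpha).

    Equals 1 at t == t0 and decays symmetrically as |t - t0| grows.
    """
    return beta / (beta + abs(t - t0) ** alpha)


# Placeholder parameters chosen only for illustration, not taken from the paper.
pmf = zipf_mandelbrot_pmf(d_max=1000, alpha=1.5, delta=2.0)

# Correlation curve sampled at monthly offsets around t0 = 0.
curve = [modified_cauchy(t, t0=0, alpha=1.0, beta=1.0) for t in range(-6, 7)]
```

In this form, a larger β gives a slower decay away from t0, which is one way to read the "drifting" behavior of sources over time.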
Abstract
The Internet has become a critical component of modern civilization requiring scientific exploration akin to endeavors to understand the land, sea, air, and space environments. Understanding the baseline statistical distributions of traffic is essential to the scientific understanding of the Internet. Correlating data from different Internet observatories and outposts can be a useful tool for gaining insights into these distributions. This work compares observed sources from the largest Internet telescope (the CAIDA darknet telescope) with those from a commercial outpost (the GreyNoise honeyfarm). Neither of these locations actively emits Internet traffic, and each provides distinct observations of unsolicited Internet traffic (primarily botnets and scanners). Newly developed GraphBLAS hyperspace matrices and D4M associative array technologies enable the efficient analysis of these data on…
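The comparison of sources seen at two vantage points can be illustrated with a small sketch. It uses plain Python dictionaries and sets as a stand-in for associative arrays; it is not the D4M or GraphBLAS API. The function name, the brightness threshold, the direction of the comparison (bright outpost sources checked against the telescope), and the toy addresses (drawn from the reserved documentation ranges) are all assumptions made for illustration.

```python
def fraction_also_seen(outpost_counts, telescope_sources, min_packets):
    """Fraction of 'bright' outpost sources that the telescope also saw.

    outpost_counts: dict mapping source address -> packet count (hypothetical).
    telescope_sources: set of source addresses seen by the telescope.
    min_packets: hypothetical brightness threshold.
    """
    bright = {src for src, n in outpost_counts.items() if n >= min_packets}
    if not bright:
        return 0.0
    return len(bright & telescope_sources) / len(bright)


# Toy data only; these are documentation-range addresses, not observations.
outpost = {
    "192.0.2.1": 500,
    "192.0.2.2": 120,
    "192.0.2.3": 3,
    "198.51.100.7": 250,
}
telescope = {"192.0.2.1", "198.51.100.7", "203.0.113.9"}

overlap = fraction_also_seen(outpost, telescope, min_packets=100)
```

Repeating this calculation for pairs of months, rather than a single window, would give the kind of time-offset curve that a temporal correlation analysis works with.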
