Zero-CPU Collection with Direct Telemetry Access
Jonatan Langlet, Ran Ben Basat, Sivaramakrishnan Ramanathan, Gabriele Oliaro, Michael Mitzenmacher, Minlan Yu, Gianni Antichi

TL;DR
This paper introduces DART, a novel approach enabling switches to directly insert telemetry data into collectors' memory, significantly reducing CPU load and improving scalability in network telemetry collection.
Contribution
DART is a probabilistic method allowing switches to write telemetry data directly into collector memory without coordination, enhancing scalability and reducing CPU overhead.
Findings
Achieves 99.9% query success rate with minimal CPU involvement.
Uses approximately 300 bytes per flow for telemetry data.
Prototypes on commodity hardware demonstrate high effectiveness.
Abstract
Programmable switches are driving a massive increase in fine-grained measurements. This puts significant pressure on telemetry collectors that have to process reports from many switches. Past research acknowledged this problem by either improving collectors' stack performance or by limiting the amount of data sent from switches. In this paper, we take a different and radical approach: switches are responsible for directly inserting queryable telemetry data into the collectors' memory, bypassing their CPU, and thereby improving their collection scalability. We propose to use a method we call direct telemetry access, where switches jointly write telemetry reports directly into the same collector's memory region, without coordination. Our solution, DART, is probabilistic, trading memory redundancy and query success probability for CPU resources at collectors. We prototype DART using…
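
The abstract describes writers that share one memory region without coordinating, and a trade-off between memory redundancy and query success probability. The toy model below is only an illustrative sketch of that general idea, not the paper's actual design: the class name SharedRegion, the SHA-256-based slot selection, the 32-bit checksum tag, and the redundancy parameter are all assumptions made for this example. Each writer blindly overwrites a few hashed slots for a key, and a query succeeds if at least one of those slots still carries the key's tag.

```python
import hashlib


class SharedRegion:
    """Toy stand-in for a memory region that many writers update blindly.

    Hypothetical model for illustration only; not DART's real layout.
    """

    def __init__(self, num_slots, redundancy):
        self.num_slots = num_slots
        self.redundancy = redundancy      # copies written per key (assumed knob)
        self.slots = [None] * num_slots   # each slot holds (tag, value) or None

    def _hash(self, key, salt):
        digest = hashlib.sha256(f"{salt}:{key}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    def _indices(self, key):
        # Every writer derives the same slots from the key alone,
        # so no coordination between writers is needed.
        return [self._hash(key, i) % self.num_slots
                for i in range(self.redundancy)]

    def _tag(self, key):
        # Short checksum stored next to the value to detect overwrites.
        return self._hash(key, "tag") & 0xFFFFFFFF

    def write(self, key, value):
        # Write-only path: the writer never reads the region first.
        for idx in self._indices(key):
            self.slots[idx] = (self._tag(key), value)

    def query(self, key):
        # Succeeds if any redundant copy survived later overwrites.
        tag = self._tag(key)
        for idx in self._indices(key):
            slot = self.slots[idx]
            if slot is not None and slot[0] == tag:
                return slot[1]
        return None


region = SharedRegion(num_slots=1024, redundancy=2)
region.write("flow-a", "path-1")
print(region.query("flow-a"))
```

In this sketch, raising redundancy makes it more likely that one copy survives collisions with other keys, at the cost of each key occupying more slots; a query that finds no matching tag returns None rather than a wrong value.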
