Profiling Multi-Level Operator Costs for Bottleneck Diagnosis in High-Speed Data Planes
Zhiyuan Ren, Yutao Liu, Wenchi Cheng, Kun Yang

TL;DR
This paper profiles operator costs in high-speed data planes from deltas in saturation throughput, without intrusive instrumentation, enabling precise bottleneck diagnosis and revealing architecture-dependent performance behavior on Arm and x86.
Contribution
It presents a saturation throughput delta-based measurement approach and the Operator Performance Quadrant framework for classifying operator costs across architectures.
Findings
Compute-intensive operators such as CRC scale super-linearly.
Most other operators scale sub-linearly.
Operator classifications differ between Arm and x86, a cross-architecture Quadrant Shift.
Abstract
This paper proposes a saturation throughput delta-based methodology to precisely measure operator costs in high-speed data planes without intrusive instrumentation. The approach captures non-linear scaling, revealing that compute-intensive operators like CRC exhibit super-linear behavior, while most others are sub-linear. We introduce the Operator Performance Quadrant (OPQ) framework to classify operators by base and scaling costs, exposing a cross-architecture Quadrant Shift between Arm and x86. This method provides accurate, architecture-aware bottleneck diagnosis and a realistic basis for performance modeling and optimization.
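The abstract does not give the paper's formulas, so the following Python sketch only illustrates the general idea under stated assumptions. It assumes that an operator's per-packet cost can be estimated as the difference of inverse saturation throughputs (with and without the operator), that scaling is judged by comparing the cost of n repetitions against n times the single-repetition cost, and that an OPQ-style quadrant follows from two thresholds. All function names, the tolerance, and the thresholds are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def per_packet_cost_ns(base_pps: float, with_op_pps: float) -> float:
    """Estimate the added per-packet time (ns) from two saturation throughputs.

    Assumption: at saturation, per-packet time is 1 / throughput, so the
    operator's cost is the difference between the two per-packet times.
    """
    if base_pps <= 0 or with_op_pps <= 0:
        raise ValueError("throughputs must be positive")
    return (1.0 / with_op_pps - 1.0 / base_pps) * 1e9


def scaling_kind(costs_ns: Sequence[float], tol: float = 0.05) -> str:
    """Classify scaling from costs_ns[k-1], the total cost of k repetitions.

    Assumption: compare the measured cost at the largest repetition count
    with a linear extrapolation of the single-repetition cost.
    """
    n = len(costs_ns)
    if n < 2 or costs_ns[0] <= 0:
        raise ValueError("need at least two points and a positive first cost")
    ratio = costs_ns[-1] / (costs_ns[0] * n)
    if ratio > 1.0 + tol:
        return "super-linear"
    if ratio < 1.0 - tol:
        return "sub-linear"
    return "linear"


def quadrant(base_cost_ns: float, scaling_cost_ns: float,
             base_threshold_ns: float, scaling_threshold_ns: float) -> str:
    """Place an operator in one of four quadrants (hypothetical thresholds)."""
    base = "high base" if base_cost_ns >= base_threshold_ns else "low base"
    scale = ("high scaling" if scaling_cost_ns >= scaling_threshold_ns
             else "low scaling")
    return f"{base} / {scale}"
```

For instance, with made-up numbers, a baseline of 10 Mpps that drops to 8 Mpps once an operator is added corresponds to roughly 25 ns of added per-packet cost under this model. Running the same classification on measurements from two architectures would show whether an operator changes quadrant.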
