Fisher Information in Flow Size Distribution
Paul Tune, Darryl Veitch

TL;DR
This paper introduces Dual Sampling, a new method that closely approximates flow sampling performance with lower computational costs, improving flow size distribution estimation in network traffic analysis.
Contribution
The paper proposes Dual Sampling (DS), a two-parameter sampling scheme that approaches the statistical performance of flow sampling (FS) at a computational cost comparable to packet sampling, and supports this with a Fisher information based evaluation.
Findings
Dual Sampling outperforms other packet-based methods
Implementation of Dual Sampling is feasible in routers
Theoretical analysis supported by simulations and case studies
Abstract
The flow size distribution is a useful metric for traffic modeling and management. Its estimation based on sampled data, however, is problematic. Previous work has shown that flow sampling (FS) offers enormous statistical benefits over packet sampling, but its high resource requirements preclude its use in routers. We present Dual Sampling (DS), a two-parameter family which, to a large extent, provides FS-like statistical performance by approaching FS continuously, with just packet-sampling-like computational cost. Our work utilizes a Fisher information based approach recently used to evaluate a number of sampling schemes, excluding FS, for TCP flows. We revise and extend the approach to make rigorous and fair comparisons between FS, DS and others. We show how DS significantly outperforms other packet-based methods, including Sample and Hold, the closest packet sampling-based competitor to…
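To make the contrast between the two baseline schemes in the abstract concrete, here is a minimal Python sketch of plain packet sampling and plain flow sampling applied to a list of flow sizes. It is an illustration under stated assumptions, not the paper's method: it does not implement Dual Sampling or Sample and Hold, and the function names, the parameters `p` and `q`, and the toy flow sizes are all hypothetical.

```python
import random
from collections import Counter


def packet_sample(flow_sizes, p, rng):
    """Packet sampling: keep each packet independently with probability p.

    Returns the observed (thinned) size of every flow that had at least
    one packet sampled. Flows with no sampled packet are never seen.
    """
    observed = []
    for size in flow_sizes:
        kept = sum(1 for _ in range(size) if rng.random() < p)
        if kept > 0:
            observed.append(kept)
    return observed


def flow_sample(flow_sizes, q, rng):
    """Flow sampling: keep each flow with probability q.

    A retained flow keeps all of its packets, so its true size is observed.
    """
    return [size for size in flow_sizes if rng.random() < q]


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed so a run is repeatable
    # Hypothetical toy workload: many small flows, a few large ones.
    flows = [1] * 600 + [2] * 250 + [10] * 100 + [200] * 50

    print("true sizes    :", sorted(Counter(flows).items()))
    print("flow sampled  :", sorted(Counter(flow_sample(flows, 0.1, rng)).items()))
    print("packet sampled:", sorted(Counter(packet_sample(flows, 0.1, rng)).items())[:5], "...")
```

In this sketch, flow sampling only ever reports sizes that occur in the original traffic, whereas packet sampling reports thinned sizes and misses many small flows entirely. This is an informal view of why recovering the flow size distribution from packet-sampled data is the harder estimation problem.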
Taxonomy
Topics: Network Traffic and Congestion Control · Internet Traffic Analysis and Secure E-voting · Statistical Methods and Inference
