On the Statistical Characterization of Flows in Internet Traffic with Application to Sampling
Yousra Chabchoub (INRIA), Christine Fricker (INRIA), Fabrice Guillemin (FT R&D), Philippe Robert (INRIA)

TL;DR
This paper introduces a statistical method for analyzing TCP flows in sampled Internet traffic, enabling accurate estimation of large flow characteristics and counts using new observables and a mathematical framework.
Contribution
It develops a novel set of observables and a mathematical framework for reliable statistical analysis of TCP flows from sampled data, including large flow estimation.
Findings
The observables make it possible to estimate the number of large TCP flows when only sampled traffic is available.
The proposed algorithm is tested against experimental data collected from different types of IP networks.
A mathematical framework is developed to estimate the accuracy of the method.
Abstract
A new method of estimating some statistical characteristics of TCP flows in the Internet is developed in this paper. For this purpose, a new set of random variables (referred to as observables) is defined. When dealing with sampled traffic, these observables can easily be computed from sampled data. By adopting a convenient mouse/elephant dichotomy also dependent on traffic, it is shown how these variables give a reliable statistical representation of the number of packets transmitted by large flows during successive time intervals with an appropriate duration. A mathematical framework is developed to estimate the accuracy of the method. As an application, it is shown how one can estimate the number of large TCP flows when only sampled traffic is available. The algorithm proposed is tested against experimental data collected from different types of IP networks.
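As a rough illustration of the sampled-traffic setting described in the abstract, the Python sketch below assumes independent per-packet sampling with probability p (the abstract does not specify the sampling scheme) and counts how many flows are seen exactly j times within one time window. The flow sizes, the sampling rate, and all function names are hypothetical; this is not the paper's definition of its observables, nor its estimation algorithm.

```python
import random
from collections import Counter


def sample_flows(flow_sizes, p, rng):
    """Sample every packet independently with probability p (an assumption).

    Returns, for each flow, the number of its packets that were sampled.
    """
    return [sum(1 for _ in range(n) if rng.random() < p) for n in flow_sizes]


def sampled_count_histogram(sampled_counts):
    """Map j -> number of flows seen exactly j times (j >= 1) in the sample."""
    return Counter(c for c in sampled_counts if c > 0)


def miss_probability(n, p):
    """Probability that a flow of n packets leaves no sampled packet at all."""
    return (1.0 - p) ** n


# Hypothetical window: 1000 small flows ("mice") and 20 large ones ("elephants").
rng = random.Random(0)
flow_sizes = [2] * 1000 + [500] * 20
p = 0.01  # hypothetical sampling rate

counts = sample_flows(flow_sizes, p, rng)
hist = sampled_count_histogram(counts)

for j in sorted(hist):
    print(f"flows sampled exactly {j} time(s): {hist[j]}")
print(f"P(miss a 2-packet flow)   = {miss_probability(2, p):.4f}")
print(f"P(miss a 500-packet flow) = {miss_probability(500, p):.4f}")
```

Under this assumption, a small flow is almost always absent from the sample while a large flow is almost always present. This gives an intuition for why quantities computed from sampled data can carry information about large flows in particular.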
Taxonomy
Topics: Network Traffic and Congestion Control · Advanced Queuing Theory Analysis · Statistical Methods and Inference
