Finding Clustering Configurations to Accurately Infer Packet Structures from Network Data
Othman Esoul, Neil Walkinshaw

TL;DR
This paper investigates how different parameters affect clustering accuracy in reverse engineering network protocols from traces, identifying key factors like distance measure and message length that influence performance.
Contribution
It systematically evaluates four parameters across multiple protocols to determine optimal configurations for accurate packet structure inference.
Findings
Distance measure choice significantly impacts clustering accuracy
Message length has a substantial effect on results
Protocol type influences optimal parameter settings
Abstract
Clustering is often used for reverse engineering network protocols from captured network traces. The performance of clustering techniques is often contingent upon the selection of various parameters, which can have a severe impact on clustering quality. In this paper we experimentally investigate the effect of four different parameters with respect to network traces. We also determine the optimal parameter configuration with respect to traces from four different network protocols. Our results indicate that the choice of distance measure and the length of the message have the most substantial impact on cluster accuracy. Depending on the type of protocol, the n-gram length can also have a substantial impact.
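As a rough illustration of how three of the parameters named in the abstract (distance measure, message length, and n-gram length) enter a clustering pipeline, the sketch below builds a pairwise distance matrix over byte n-grams. It is not the authors' implementation: the Jaccard distance, the function names (`ngram_profile`, `jaccard_distance`, `distance_matrix`), and the sample messages are all assumptions chosen for illustration.

```python
from collections import Counter


def ngram_profile(message: bytes, n: int, max_len: int) -> Counter:
    """Count byte n-grams in the first max_len bytes of a message."""
    prefix = message[:max_len]  # message-length parameter (truncation)
    # An empty range (prefix shorter than n) yields an empty profile.
    return Counter(prefix[i:i + n] for i in range(len(prefix) - n + 1))


def jaccard_distance(a: Counter, b: Counter) -> float:
    """Jaccard distance over the sets of distinct n-grams (an assumed measure)."""
    keys_a, keys_b = set(a), set(b)
    union = keys_a | keys_b
    if not union:
        return 0.0  # two empty profiles are treated as identical
    return 1.0 - len(keys_a & keys_b) / len(union)


def distance_matrix(messages, n: int, max_len: int):
    """Pairwise distances; this matrix would be the input to a clustering step."""
    profiles = [ngram_profile(m, n, max_len) for m in messages]
    return [[jaccard_distance(p, q) for q in profiles] for p in profiles]


# Hypothetical messages, not taken from the paper's traces.
messages = [
    b"GET /index.html HTTP/1.1",
    b"GET /about.html HTTP/1.1",
    b"POST /login HTTP/1.1",
]

# Varying n and max_len changes the distances, and therefore the clusters.
for n, max_len in [(2, 4), (3, 24)]:
    print(n, max_len, distance_matrix(messages, n, max_len))
```

Swapping `jaccard_distance` for another measure, or changing `n` and `max_len`, alters the matrix handed to the clustering algorithm, which is the kind of sensitivity the paper sets out to measure.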
Taxonomy
Topics: Network Security and Intrusion Detection · Network Packet Processing and Optimization · Internet Traffic Analysis and Secure E-voting
