Network Message Field Type Classification and Recognition for Unknown Binary Protocols
Stephan Kleber, Milan Stute, Matthias Hollick, Frank Kargl

TL;DR
This paper introduces a novel, generic approach for classifying and recognizing data types in message fields of unknown binary network protocols, aiding reverse engineering and security analysis.
Contribution
It presents the first method to analyze unknown binary protocol messages by segmenting and clustering byte sequences to identify data types, improving accuracy over existing approaches.
Findings
Achieves up to 100% data-type recognition precision
Improves classification accuracy by a factor of 1.3 to 3.7
Provides an open-source implementation for community use
Abstract
Reverse engineering of unknown network protocols based on recorded traffic traces enables security analyses and debugging of undocumented network services. In particular for binary protocols, existing approaches (1) lack comprehensive methods to classify or determine the data type of a discovered segment in a message, e.g., a number, timestamp, or network address, that would allow for a semantic interpretation and (2) make strong assumptions that prevent analysis of lower-layer protocols often found in IoT or mobile systems. In this paper, we propose the first generic method for analyzing unknown messages from binary protocols to reveal the data types in message fields. To this end, we split messages into segments of bytes and use their vector interpretation to calculate similarities. These can be used to create clusters of segments with the same type and, moreover, to recognize…
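The idea of treating byte segments as vectors and grouping them by similarity can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the abstract: the fixed-size splitter `split_fixed`, the choice of a normalized Canberra dissimilarity, the single-linkage grouping in `cluster`, and the threshold value are all hypothetical stand-ins, not the authors' segmenter, similarity measure, or clustering algorithm.

```python
from itertools import combinations


def split_fixed(message: bytes, size: int) -> list[bytes]:
    """Split a message into fixed-size segments (hypothetical stand-in for a real segmenter).

    Trailing bytes that do not fill a whole segment are dropped.
    """
    return [message[i:i + size] for i in range(0, len(message) - size + 1, size)]


def canberra(a: bytes, b: bytes) -> float:
    """Normalized Canberra dissimilarity between equal-length byte vectors, in [0, 1]."""
    if len(a) != len(b):
        raise ValueError("segments must have equal length")
    if not a:
        raise ValueError("segments must not be empty")
    total = 0.0
    for x, y in zip(a, b):
        if x or y:  # a 0/0 term contributes nothing
            total += abs(x - y) / (x + y)
    return total / len(a)


def cluster(segments: list[bytes], threshold: float) -> list[list[bytes]]:
    """Single-linkage grouping: segments closer than `threshold` end up in one cluster."""
    parent = list(range(len(segments)))

    def find(i: int) -> int:
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(segments)), 2):
        if canberra(segments[i], segments[j]) < threshold:
            parent[find(i)] = find(j)

    groups: dict[int, list[bytes]] = {}
    for i, seg in enumerate(segments):
        groups.setdefault(find(i), []).append(seg)
    return list(groups.values())
```

For example, calling `cluster` with a threshold of 0.2 on two timestamp-like segments that differ only in their last byte and two small counter-like segments should yield two groups, since the timestamp-like values are close to each other but far from the counters. A real analysis would need segments of varying length and a clustering method that tolerates noise, which this sketch does not attempt.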
