Large-Scale Classification of IPv6-IPv4 Siblings with Variable Clock Skew
Quirin Scheitle, Oliver Gasser, Minoo Rouhi, Georg Carle

TL;DR
This paper presents a method for accurately classifying IPv6 and IPv4 address pairs as siblings using active measurements of TCP timestamps and variable clock skew. It achieves over 99% precision and scales to large datasets.
Contribution
The study introduces a novel approach leveraging variable clock skew features and machine learning to classify IPv6-IPv4 sibling pairs with high accuracy and scalability.
Findings
Models exceed 99% precision in classification
Method scales to large datasets of 149,000 siblings
Features include estimation of variable remote clock skew
Abstract
Linking the growing IPv6 deployment to existing IPv4 addresses is an interesting field of research, be it for network forensics, structural analysis, or reconnaissance. In this work, we focus on classifying pairs of server IPv6 and IPv4 addresses as siblings, i.e., running on the same machine. Our methodology leverages active measurements of TCP timestamps and other network characteristics, which we measure against a diverse ground truth of 682 hosts. We define and extract a set of features, including estimation of variable (opposed to constant) remote clock skew. On these features, we train a manually crafted algorithm as well as a machine-learned decision tree. By conducting several measurement runs and training in cross-validation rounds, we aim to create models that generalize well and do not overfit our training data. We find both models to exceed 99% precision in train and test…
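To make the clock-skew idea concrete, here is a minimal sketch of the simpler constant-skew baseline, not the authors' variable-skew estimator or their published classification code. It assumes each address was probed repeatedly and that every probe yields a local receive time in seconds and a remote TCP timestamp value. The function names, the timestamp frequency `hz`, and the tolerance `tol_ppm` are hypothetical choices for illustration only.

```python
from statistics import mean

def offsets(samples, hz):
    """Turn (recv_time_s, tsval) pairs into (elapsed_s, offset_s) series.

    offset_s is how far the remote clock has drifted from the local clock
    since the first sample. hz is the assumed remote timestamp frequency.
    """
    t0, ts0 = samples[0]
    xs = [t - t0 for t, _ in samples]
    ys = [(ts - ts0) / hz - (t - t0) for t, ts in samples]
    return xs, ys

def fit_slope(xs, ys):
    """Ordinary least-squares slope of ys over xs."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx

def skew_ppm(samples, hz):
    """Constant clock skew estimate in parts per million."""
    xs, ys = offsets(samples, hz)
    return fit_slope(xs, ys) * 1e6

def looks_like_sibling(v4_samples, v6_samples, hz, tol_ppm=1.0):
    """Hypothetical rule: similar skew on both addresses hints at one machine."""
    return abs(skew_ppm(v4_samples, hz) - skew_ppm(v6_samples, hz)) <= tol_ppm
```

A single fitted slope only captures constant drift. Hosts whose skew changes over time would need a richer comparison of the two offset curves, which is the case the variable-skew feature in the abstract addresses.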
