SMoRFFI: A Large-Scale Same-Model 2.4 GHz Wi-Fi Dataset and Reproducible Framework for RF Fingerprinting
Zewei Guo, Zhen Jia, JinXiao Zhu, Wenhao Huang, Yin Chen

TL;DR
This paper introduces a large-scale RF fingerprinting dataset of 123 identical Wi-Fi devices and an open-source framework, enabling robust device identification and reproducible research in RF signal analysis.
Contribution
It provides the first large-scale same-model Wi-Fi dataset with an open-source framework, facilitating improved RF fingerprinting research and evaluation.
Findings
A Random Forest baseline achieved 89.06% device identification accuracy.
Collected 35.42 million raw I/Q samples and 1.85 million RF features.
Established a reproducible pipeline from data collection to evaluation.
Abstract
Radio frequency (RF) fingerprinting exploits hardware imperfections for device identification, but distinguishing between same-model devices remains challenging because their hardware variations are minimal. Existing datasets for RF fingerprinting are constrained by small device counts and heterogeneous models, which hinder robust training and fair evaluation of machine learning methods. To address this gap, we introduce a large-scale dataset of same-model devices along with an open-source experimental framework. The dataset is built from 123 same-model commercial IEEE 802.11g devices and contains 35.42 million raw I/Q samples from the preambles, together with the corresponding 1.85 million RF features. The accompanying framework provides a fully reproducible pipeline from data collection to performance evaluation. Within this framework, a Random Forest-based algorithm is implemented as a…
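To make the step from raw preamble I/Q samples to an RF feature concrete, here is a minimal sketch of one feature that is commonly used in RF fingerprinting: carrier frequency offset (CFO), estimated by delay-and-correlate over a periodic preamble. The abstract does not say which features the dataset actually contains, so this is an illustration only and not the authors' method. The 20 MHz sample rate and the 16-sample period assume OFDM frames whose short training symbols repeat every 0.8 µs; the function names and the synthetic test signal are hypothetical.

```python
import cmath
import math

SAMPLE_RATE_HZ = 20e6  # assumption: nominal OFDM sample rate, not stated in the abstract
STS_PERIOD = 16        # assumption: short training symbol period in samples at 20 MHz


def estimate_cfo_hz(iq, period=STS_PERIOD, sample_rate_hz=SAMPLE_RATE_HZ):
    """Estimate carrier frequency offset from a periodic preamble segment."""
    if len(iq) <= period:
        raise ValueError("need more than one period of samples")
    # Correlate each sample with the one a full period later. For a periodic
    # signal, the phase of the sum equals 2*pi*cfo*period/sample_rate.
    acc = sum(iq[n + period] * iq[n].conjugate()
              for n in range(len(iq) - period))
    return cmath.phase(acc) * sample_rate_hz / (2 * math.pi * period)


def synthetic_preamble(cfo_hz, n_periods=10, period=STS_PERIOD,
                       sample_rate_hz=SAMPLE_RATE_HZ):
    """Toy periodic unit-magnitude sequence rotated by a known CFO (test data only)."""
    base = [cmath.exp(2j * math.pi * (k * k) / period) for k in range(period)]
    return [base[n % period]
            * cmath.exp(2j * math.pi * cfo_hz * n / sample_rate_hz)
            for n in range(n_periods * period)]


if __name__ == "__main__":
    injected_hz = 50_000.0
    estimate = estimate_cfo_hz(synthetic_preamble(injected_hz))
    print(f"injected {injected_hz:.1f} Hz, estimated {estimate:.1f} Hz")
```

With these assumed parameters the estimator is unambiguous only for offsets within ±625 kHz (half the sample rate divided by the period). A scalar like this, computed per captured frame, is the kind of value that could populate one column of a feature table fed to a classifier such as a Random Forest.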
Taxonomy
Topics: Wireless Signal Modulation Classification · Internet Traffic Analysis and Secure E-voting · Indoor and Outdoor Localization Technologies
