IoT Device Identification with Machine Learning: Common Pitfalls and Best Practices
Kahraman Kostas, Rabia Yasa Kostas

TL;DR
This paper reviews machine learning-based IoT device identification, highlighting common pitfalls and offering best practices to improve model robustness, reproducibility, and generalizability in IoT security applications.
Contribution
It provides a critical analysis of existing methods, identifies key errors, and offers guidelines to address challenges in IoT device identification using machine learning.
Findings
Identifies improper data augmentation as a common error.
Highlights issues with misleading session identifiers.
Provides best practices for reproducibility and generalization.
Abstract
This paper critically examines the device identification process using machine learning, addressing common pitfalls in the existing literature. We analyze the trade-offs between identification methods (unique vs. class-based), data heterogeneity, feature extraction challenges, and evaluation metrics. By highlighting specific errors, such as improper data augmentation and misleading session identifiers, we provide a robust guideline for researchers to enhance the reproducibility and generalizability of IoT security models.
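As an illustration of how an augmentation pitfall can arise, the sketch below splits records by session before any oversampling, so that duplicated rows never appear on both sides of the train/test boundary. This is an assumed reading of "improper data augmentation", not necessarily the exact case the paper analyzes. The field names (`session_id`, `device`, `pkt_len`), the helper functions, and the toy records are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def split_by_session(records, test_fraction=0.25, seed=0):
    """Split records so that each session lands entirely in train or in test."""
    by_session = defaultdict(list)
    for rec in records:
        by_session[rec["session_id"]].append(rec)
    session_ids = sorted(by_session)
    random.Random(seed).shuffle(session_ids)
    n_test = max(1, round(len(session_ids) * test_fraction))
    test_ids = set(session_ids[:n_test])
    train = [r for s in session_ids if s not in test_ids for r in by_session[s]]
    test = [r for s in session_ids if s in test_ids for r in by_session[s]]
    return train, test


def oversample_minority(train, seed=0):
    """Randomly duplicate training rows until every device label is equally frequent."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for rec in train:
        by_label[rec["device"]].append(rec)
    target = max(len(rows) for rows in by_label.values())
    balanced = []
    for rows in by_label.values():
        balanced.extend(rows)
        balanced.extend(rng.choice(rows) for _ in range(target - len(rows)))
    return balanced


# Hypothetical toy data: (session, device label, number of packets in the session).
sessions = [
    ("s1", "camera", 4), ("s2", "camera", 4), ("s3", "camera", 4), ("s4", "camera", 4),
    ("s5", "plug", 1), ("s6", "plug", 1), ("s7", "plug", 1), ("s8", "plug", 1),
]
records = [
    {"session_id": s, "device": d, "pkt_len": 60 + 10 * i}
    for s, d, n in sessions
    for i in range(n)
]

# Split first, then augment the training side only; the test set stays untouched.
train, test = split_by_session(records, test_fraction=0.25, seed=0)
balanced_train = oversample_minority(train, seed=0)
label_counts = Counter(r["device"] for r in balanced_train)
```

Oversampling before the split would instead let copies of the same record reach both sets, which tends to inflate test scores. Libraries such as scikit-learn offer group-aware splitters that serve the same purpose; the standard-library version here is only meant to make the ordering explicit.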
Taxonomy
Topics: Internet Traffic Analysis and Secure E-voting · Advanced Malware Detection Techniques · User Authentication and Security Systems
