# Improved temporal IoT device identification using robust statistical features

**Authors:** Nik Aqil, Faiz Zaki, Firdaus Afifi, Hazim Hanif, Miss Laiha Mat Kiah, Nor Badrul Anuar

PMC · DOI: 10.7717/peerj-cs.2145 · PeerJ Computer Science · 2024-07-09

## TL;DR

This paper introduces a new method for identifying IoT devices using stable statistical features, improving accuracy and reducing the need for frequent retraining.

## Contribution

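The main contribution is a robust statistical feature set derived from payload lengths. It models the characteristic traffic of each IoT device and is designed to remain stable over time, so that identification accuracy degrades less between retraining cycles.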
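To illustrate what "robust" statistics over payload lengths could look like, here is a minimal sketch using only the Python standard library. The choice of median, median absolute deviation (MAD), and interquartile range (IQR) is an assumption made for illustration; this summary does not list the paper's actual features. The function name `robust_payload_features` and the sample values are hypothetical.

```python
from statistics import median, quantiles


def robust_payload_features(payload_lengths):
    """Summarise one flow's payload lengths with outlier-resistant statistics.

    Illustrative only: the statistics chosen here are an assumption,
    not the feature set defined in the paper.
    """
    if not payload_lengths:
        raise ValueError("need at least one payload length")

    values = sorted(payload_lengths)
    med = median(values)
    # Median absolute deviation: the median distance from the median.
    mad = median(abs(v - med) for v in values)
    # Interquartile range; a single observation has no spread.
    if len(values) >= 2:
        q1, _, q3 = quantiles(values, n=4, method="inclusive")
    else:
        q1 = q3 = values[0]
    return {"median": med, "mad": mad, "iqr": q3 - q1}


# Hypothetical flow: four small packets and one unusually large one.
features = robust_payload_features([100, 100, 102, 104, 1500])
```

A single large packet shifts the mean of such a flow substantially, while the median, MAD, and IQR barely move. That insensitivity to occasional outliers is the general property a time-stable feature set would rely on.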

## Key findings

- On the public IoT Traffic Traces dataset, the proposed feature set stayed above 80% accuracy in every week and outperformed the selected benchmark studies.
- On the self-collected IoT-FSCIT dataset, accuracy over time improved by 10.13%.

## Abstract

The Internet of Things (IoT) is becoming more prevalent in our daily lives. A recent industry report projected the global IoT market to be worth more than USD 4 trillion by 2032. To cope with the ever-increasing number of IoT devices in use, identifying and securing IoT devices has become crucial for network administrators. In that regard, network traffic classification offers a promising solution by precisely identifying IoT devices to enhance network visibility, allowing better network security. Currently, most IoT device identification solutions revolve around machine learning, outperforming prior solutions such as port-based and behaviour-based approaches. Although performant, these solutions often experience performance degradation over time due to statistical changes in the data. As a result, they require frequent retraining, which is computationally expensive. Therefore, this article aims to improve model performance through a robust alternative feature set. The improved feature set leverages payload lengths to model the unique characteristics of IoT devices and remains stable over time. In addition, this article uses the proposed feature set with Random Forest and OneVSRest to optimize the learning process, particularly to make adding new IoT devices easier. This article also introduces weekly dataset segmentation to ensure fair evaluation over different time frames. Evaluation on two datasets, the public IoT Traffic Traces dataset and the self-collected IoT-FSCIT dataset, shows that the proposed feature set maintained above 80% accuracy throughout all weeks on the IoT Traffic Traces dataset, outperforming selected benchmark studies, while improving accuracy over time by +10.13% on the IoT-FSCIT dataset.
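The weekly dataset segmentation mentioned in the abstract can be sketched roughly as follows, using only the Python standard library. This assumes each record carries a Unix timestamp in seconds and that weeks are consecutive 7-day windows counted from the earliest record; the paper's exact windowing is not given in this summary. The function name `segment_by_week` and the sample records are hypothetical.

```python
from collections import defaultdict

SECONDS_PER_WEEK = 7 * 24 * 60 * 60


def segment_by_week(records, start_ts=None):
    """Group (timestamp, item) pairs into consecutive 7-day windows.

    Week 0 starts at `start_ts`, or at the earliest timestamp if omitted.
    Illustrative only: the windowing rule is an assumption.
    """
    records = list(records)
    if not records:
        return {}
    if start_ts is None:
        start_ts = min(ts for ts, _ in records)

    weeks = defaultdict(list)
    for ts, item in records:
        if ts < start_ts:
            raise ValueError("record is older than the segmentation start")
        week_index = int((ts - start_ts) // SECONDS_PER_WEEK)
        weeks[week_index].append(item)
    # Return weeks in chronological order.
    return dict(sorted(weeks.items()))


DAY = 24 * 60 * 60
# Hypothetical flows, labelled only by name for brevity.
flows = [(0, "flow-a"), (3 * DAY, "flow-b"), (7 * DAY, "flow-c"), (15 * DAY, "flow-d")]
weekly = segment_by_week(flows)
```

One plausible way to use such a split is to train on week 0 and score each later week separately, so that any drop in accuracy over time is visible per week instead of being averaged away. Whether the paper follows exactly this protocol is not stated here.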

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC11323103/full.md

## Figures

10 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11323103/full.md

## References

22 references — full list in the complete paper: https://tomesphere.com/paper/PMC11323103/full.md

---
Source: https://tomesphere.com/paper/PMC11323103