A Multi-Step Comparative Framework for Anomaly Detection in IoT Data Streams
Mohammed Al-Qudah, Fadi AlMahamid

TL;DR
This paper introduces a multi-step evaluation framework that systematically analyzes how different preprocessing steps affect the performance of various machine learning models for anomaly detection in IoT data streams.
Contribution
It provides a structured approach to assess the impact of preprocessing choices on multiple ML algorithms, offering guidance for improved IoT anomaly detection.
Findings
GBoosting outperforms other models in accuracy across preprocessing configurations.
RNN-LSTM benefits from z-score normalization.
Autoencoders achieve high recall, which makes them suitable for unsupervised detection.
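The z-score normalization named in the findings rescales each feature to zero mean and unit standard deviation. The sketch below is a minimal, standard-library illustration of that idea, not the paper's implementation. The function name `z_score` and the sample packet-size values are hypothetical.

```python
from statistics import mean, pstdev


def z_score(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant feature has no spread; return zeros rather than divide by zero.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical feature column (not taken from IoTID20): four typical
# packet sizes and one unusually large one.
packet_sizes = [60, 62, 61, 59, 1500]
scaled = z_score(packet_sizes)
```

After scaling, the large value sits about two standard deviations above the mean while the typical values cluster just below zero, so features with very different raw ranges become comparable for a model such as an LSTM.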
Abstract
The rapid expansion of Internet of Things (IoT) devices has introduced critical security challenges, underscoring the need for accurate anomaly detection. Although numerous studies have proposed machine learning (ML) methods for this purpose, limited research systematically examines how different preprocessing steps--normalization, transformation, and feature selection--interact with distinct model architectures. To address this gap, this paper presents a multi-step evaluation framework assessing the combined impact of preprocessing choices on three ML algorithms: RNN-LSTM, autoencoder neural networks (ANN), and Gradient Boosting (GBoosting). Experiments on the IoTID20 dataset show that GBoosting consistently delivers superior accuracy across preprocessing configurations, while RNN-LSTM shows notable gains with z-score normalization and autoencoders excel in recall, making them…
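One way to picture the evaluation grid described in the abstract is as the cross product of preprocessing choices and models. The sketch below only enumerates such configurations; it trains nothing and is not the authors' code. The three models and z-score normalization are named in the abstract. Every other option value ("none", "min-max", "log", "all-features", "top-k") is a placeholder assumed for illustration.

```python
from itertools import product

# Preprocessing stages named in the abstract. Apart from "z-score",
# the option values are hypothetical placeholders.
normalizations = ["none", "min-max", "z-score"]
transformations = ["none", "log"]
feature_selections = ["all-features", "top-k"]

# The three model families named in the abstract.
models = ["RNN-LSTM", "Autoencoder", "GBoosting"]


def build_configurations():
    """Return one dict per (normalization, transformation, selection, model) combination."""
    return [
        {
            "normalization": norm,
            "transformation": trans,
            "feature_selection": select,
            "model": model,
        }
        for norm, trans, select, model in product(
            normalizations, transformations, feature_selections, models
        )
    ]


configs = build_configurations()
```

Each dictionary would then drive one train-and-evaluate run, so per-model metrics can be compared across preprocessing choices instead of under a single fixed pipeline.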
