IoT Data Trust Evaluation via Machine Learning
Timothy Tadj, Reza Arablouei, Volkan Dedeoglu

TL;DR
This paper introduces random walk infilling (RWI), a data synthesis method that generates labeled IoT datasets for trust evaluation, proposes new correlation-based features, and reports improved ML model performance over existing methods.
Contribution
The paper presents RWI, a data augmentation technique, together with features that capture a sensor's auto-correlation and its cross-correlation with neighboring sensors. Both aim to improve ML-based IoT data trust evaluation and to address the lack of benchmark datasets.
Findings
RWI-generated datasets improve ML model generalization.
Correlation features outperform traditional features.
Semi-supervised approach achieves competitive results with minimal labels.
Abstract
Various approaches based on supervised or unsupervised machine learning (ML) have been proposed for evaluating IoT data trust. However, assessing their real-world efficacy is hard, mainly due to the lack of related publicly available datasets that can be used for benchmarking. Since obtaining such datasets is challenging, we propose a data synthesis method, called random walk infilling (RWI), to augment IoT time-series datasets by synthesizing untrustworthy data from existing trustworthy data. Thus, RWI enables us to create labeled datasets that can be used to develop and validate ML models for IoT data trust evaluation. We also extract new features from IoT time-series sensor data that effectively capture its auto-correlation as well as its cross-correlation with the data of the neighboring (peer) sensors. These features can be used to learn ML models for recognizing the trustworthiness…
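The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the summary does not give the exact RWI procedure or feature definitions, so the function names, the Gaussian step distribution, the `step_std` parameter, and the use of Pearson correlation are all assumptions made for illustration. The sketch replaces a segment of a trustworthy series with a random walk, labels the replaced samples as untrustworthy, and computes one auto-correlation and one peer cross-correlation feature.

```python
import numpy as np


def random_walk_infill(series, start, length, step_std, rng):
    """Replace series[start:start+length] with a Gaussian random walk.

    Hypothetical helper: the walk starts from the last untouched value
    (series[start - 1]). Returns the modified copy and a 0/1 label array
    in which 1 marks synthesized (untrustworthy) samples.
    """
    series = np.asarray(series, dtype=float)
    if start < 1 or start + length > series.size:
        raise ValueError("segment must lie inside the series and start at index >= 1")
    out = series.copy()
    labels = np.zeros(series.size, dtype=int)
    steps = rng.normal(0.0, step_std, size=length)  # assumed step distribution
    out[start:start + length] = series[start - 1] + np.cumsum(steps)
    labels[start:start + length] = 1
    return out, labels


def lag_autocorr(x, lag=1):
    """Pearson correlation between a series and itself shifted by `lag`."""
    x = np.asarray(x, dtype=float)
    return float(np.corrcoef(x[:-lag], x[lag:])[0, 1])


def peer_corr(x, peer):
    """Pearson correlation between a sensor and a neighboring (peer) sensor."""
    return float(np.corrcoef(x, peer)[0, 1])


# Two synthetic sensors observing the same periodic signal with small noise.
rng = np.random.default_rng(0)
t = np.arange(200)
sensor = np.sin(2 * np.pi * t / 50) + 0.05 * rng.normal(size=t.size)
peer = np.sin(2 * np.pi * t / 50) + 0.05 * rng.normal(size=t.size)

# Synthesize an untrustworthy segment of 60 samples starting at index 80.
tampered, labels = random_walk_infill(sensor, start=80, length=60,
                                      step_std=0.3, rng=rng)

# Features over the whole series, before and after infilling.
features_clean = (lag_autocorr(sensor), peer_corr(sensor, peer))
features_tampered = (lag_autocorr(tampered), peer_corr(tampered, peer))
```

In practice such features would more likely be computed over sliding windows so that each window receives its own label. Both correlation helpers return NaN for a constant input, which a real pipeline would need to handle.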
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Traffic Prediction and Management Techniques · Internet Traffic Analysis and Secure E-voting
