IoT Device Labeling Using Large Language Models
Bar Meyuhas, Anat Bremler-Barr, Tal Shapira

TL;DR
This paper presents a novel approach using Large Language Models to identify and label previously unseen IoT devices by leveraging network data, search enrichment, and zero-shot classification, improving device security and management.
Contribution
It introduces the first AI-driven IoT device labeling method that handles unseen devices through LLMs, search data, and catalog updates, advancing IoT security and observability.
Findings
Achieved a HIT1 score of 0.7 and a HIT2 score of 0.77 on a set of 97 devices.
First research to automate IoT device labeling with AI.
Demonstrated effective zero-shot classification for device functions.
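The HIT1 and HIT2 scores above are not defined in this summary. A minimal sketch, assuming HITk follows the common top-k convention (the fraction of devices whose true label appears among the k highest-ranked predictions), might look like the following. The function name `hit_at_k` and the toy labels are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def hit_at_k(
    ranked_predictions: Sequence[Sequence[str]],
    true_labels: Sequence[str],
    k: int,
) -> float:
    """Fraction of devices whose true label is among the top-k predictions.

    Assumption: this is the usual top-k hit rate; the paper's exact
    definition of HIT1/HIT2 is not given in this summary.
    """
    if len(ranked_predictions) != len(true_labels):
        raise ValueError("predictions and labels must have the same length")
    if not true_labels:
        return 0.0
    hits = sum(
        1
        for preds, truth in zip(ranked_predictions, true_labels)
        if truth in preds[:k]
    )
    return hits / len(true_labels)


# Hypothetical toy data: each inner list is ranked best-first.
predictions = [
    ["camera", "doorbell"],
    ["speaker", "hub"],
    ["plug", "vacuum cleaner"],
    ["hub", "light bulb"],
]
truth = ["camera", "hub", "vacuum cleaner", "thermostat"]

hit1 = hit_at_k(predictions, truth, 1)
hit2 = hit_at_k(predictions, truth, 2)
```

Under this reading, HIT2 can never be lower than HIT1, which is consistent with the reported 0.77 versus 0.7.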
Abstract
The IoT market is diverse and characterized by a multitude of vendors that support different device functions (e.g., speaker, camera, vacuum cleaner). Within this market, IoT security and observability systems use real-time identification techniques to manage these devices effectively. Most existing IoT identification solutions employ machine learning techniques that assume the IoT device, labeled by both its vendor and function, was observed during their training phase. We tackle a key challenge in IoT labeling: how can an AI solution label an IoT device that has never been seen before and whose label is unknown? Our solution extracts textual features such as domain names and hostnames from network traffic, and then enriches these features using Google search data alongside a catalog of vendors and device functions. The solution also integrates an auto-update mechanism that uses…
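To make the first step of the abstract concrete, here is a rough sketch of turning observed domain names and hostnames into tokens and matching them against a catalog. This is not the authors' pipeline: it swaps the search enrichment and LLM-based zero-shot classification for plain keyword matching, purely to illustrate the idea of textual features plus a catalog. The catalogs, the hostnames, and the function names `tokenize` and `rank_candidates` are all hypothetical.

```python
import re
from collections import Counter
from typing import Iterable, List, Set

# Hypothetical catalogs; the paper's real catalogs are not shown in this summary.
VENDOR_CATALOG: Set[str] = {"acme", "globex"}
FUNCTION_CATALOG: Set[str] = {"camera", "speaker", "vacuum"}


def tokenize(names: Iterable[str]) -> List[str]:
    """Split domain names and hostnames into lowercase alphabetic tokens."""
    tokens: List[str] = []
    for name in names:
        tokens.extend(t for t in re.split(r"[^a-z]+", name.lower()) if t)
    return tokens


def rank_candidates(names: Iterable[str], catalog: Set[str]) -> List[str]:
    """Rank catalog entries by how often they occur in the observed names.

    Simple keyword matching, standing in for the enrichment and
    zero-shot classification steps described in the abstract.
    """
    counts = Counter(t for t in tokenize(names) if t in catalog)
    return [label for label, _ in counts.most_common()]


# Hypothetical names observed in one device's traffic (e.g. DNS queries, DHCP hostname).
observed = [
    "api.acme-cloud.example",
    "acme-camera-1a2b.local",
    "time.example.org",
]

vendor_candidates = rank_candidates(observed, VENDOR_CATALOG)
function_candidates = rank_candidates(observed, FUNCTION_CATALOG)
```

Keyword matching like this fails as soon as a name does not literally contain a catalog entry, which is the gap that enrichment and a language model are meant to close.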
Taxonomy
Topics: Big Data and Digital Economy · Business Process Modeling and Analysis · IoT and Edge/Fog Computing
