Cracking IoT Security: Can LLMs Outsmart Static Analysis Tools?

Jason Quantrill; Noura Khajehnouri; Zihan Guo; Manar H. Alalfi

arXiv:2601.00559 · cs.CR · January 5, 2026

Open Access

TL;DR

This paper evaluates the effectiveness of Large Language Models in detecting interaction threats in IoT smart home rules, comparing their performance to traditional symbolic analysis methods and highlighting their current limitations.

Contribution

It provides the first comprehensive benchmarking of LLMs for IoT interaction threat detection, revealing their strengths and weaknesses relative to symbolic static analysis.

Findings

01

LLMs perform well on semantic understanding of threats.

02

LLMs struggle with structural reasoning in mutated rule scenarios.

03

Symbolic analysis remains more reliable across various rule transformations.

Abstract

Smart home IoT platforms such as openHAB rely on Trigger-Action-Condition (TAC) rules to automate device behavior, but the interplay among these rules can give rise to interaction threats: unintended or unsafe behaviors that emerge from implicit dependencies, conflicting triggers, or overlapping conditions. Identifying these threats requires both semantic understanding and structural reasoning, which have traditionally depended on symbolic, constraint-driven static analysis. This work presents the first comprehensive evaluation of Large Language Models (LLMs) across a multi-category interaction threat taxonomy, assessing their performance on both the original openHAB (oHC/IoTB) dataset and a structurally challenging Mutation dataset designed to test robustness under rule transformations. We benchmark Llama 3.1 8B, Llama 70B, GPT-4o, Gemini-2.5-Pro, and DeepSeek-R1 across zero-, one-, and two-shot…
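To make the idea of an interaction threat concrete, the following is a minimal Python sketch, not the paper's method or openHAB's rule syntax. It models a TAC rule as a small record and flags one simple kind of conflict: two rules that fire on the same trigger under the same condition but send different commands to the same device. The `Rule` fields, the `conflicting_pairs` helper, and the sample rules are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Rule:
    """Hypothetical, simplified TAC rule (not openHAB's actual rule format)."""
    name: str
    trigger: str    # event that fires the rule, e.g. "door_opened"
    condition: str  # guard that must hold, e.g. "night"
    device: str     # device the action targets
    command: str    # command sent to the device, e.g. "ON"


def conflicting_pairs(rules):
    """Return name pairs of rules that share a trigger and condition
    but send different commands to the same device."""
    pairs = []
    for a, b in combinations(rules, 2):
        same_context = a.trigger == b.trigger and a.condition == b.condition
        opposing = a.device == b.device and a.command != b.command
        if same_context and opposing:
            pairs.append((a.name, b.name))
    return pairs


# Illustrative rules only; none of these come from the paper's datasets.
rules = [
    Rule("welcome_light", "door_opened", "night", "hall_light", "ON"),
    Rule("energy_saver", "door_opened", "night", "hall_light", "OFF"),
    Rule("heater", "temp_low", "home", "heater", "ON"),
]

print(conflicting_pairs(rules))  # [('welcome_light', 'energy_saver')]
```

This exact-match check only illustrates the flavor of structural reasoning involved. Threats that arise from implicit dependencies or merely overlapping (rather than identical) conditions would need the kind of constraint-driven analysis the abstract refers to.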

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Advanced Malware Detection Techniques · Security and Verification in Computing