# From Edge Transformer to IoT Decisions: Offloaded Embeddings for Lightweight Intrusion Detection

**Authors:** Frédéric Adjewa, Moez Esseghir, Leïla Merghem-Boulahia

PMC · DOI: 10.3390/s26020356 · Sensors (Basel, Switzerland) · 2026-01-06

## TL;DR

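This paper introduces SEED, a lightweight intrusion detection system for IoT that offloads embedding generation to a compact, fine-tuned BERT model at the edge and keeps only a small neural network on the device, reaching 99.9% detection accuracy with an average inference time under 70 ms on a standard CPU.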

## Contribution

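The paper proposes SEED (Semantic Embeddings for Efficient Detection), a collaborative embeddings offloading mechanism for IoT intrusion detection: a fine-tuned BERT model at the edge generates network embeddings, and a compact neural network on the end device queries them to assess whether a network flow is normal.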

## Key findings

- The optimized BERT model is reduced by approximately 90% from its original size to approximately 41 MB, suitable for edge deployment.
- The compact neural network deployed on IoT devices is only 137 KB.
- The system achieves 99.9% detection accuracy with an average inference time of under 70 ms on a standard CPU.
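As a rough sanity check on the size figures above, the arithmetic can be sketched as follows. The helper names are hypothetical, and two assumptions are not stated in the summary: that the 137 KB network stores its weights as 32-bit floats, and that 1 KB means 1024 bytes. The results are back-of-the-envelope estimates, not numbers reported by the authors.

```python
def implied_original_size_mb(reduced_mb: float, reduction_fraction: float) -> float:
    """Size before compression, given the size after it and the fractional reduction."""
    return reduced_mb / (1.0 - reduction_fraction)


def float32_param_budget(size_kb: float) -> int:
    """Upper bound on float32 parameters that fit in size_kb.

    Assumes 1 KB = 1024 bytes and 4 bytes per parameter (both assumptions).
    """
    return int(size_kb * 1024) // 4


# "Reduced by approximately 90%" to "approximately 41 MB" implies an
# original model of roughly 410 MB.
original_mb = implied_original_size_mb(41.0, 0.90)

# A 137 KB network has room for about 35 thousand float32 parameters.
device_params = float32_param_budget(137)

print(round(original_mb), device_params)
```

Under these assumptions the on-device network is limited to a few tens of thousands of parameters, which is consistent with a small classification head rather than a transformer.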

## Abstract

The convergence of Artificial Intelligence (AI) and the Internet of Things (IoT) is enabling a new class of intelligent applications. Specifically, Large Language Models (LLMs) are emerging as powerful tools not only for natural language understanding but also for enhancing IoT security. However, the integration of these computationally intensive models into resource-constrained IoT environments presents significant challenges. This paper provides an in-depth examination of how LLMs can be adapted to secure IoT ecosystems. We identify key application areas, discuss major challenges, and propose optimization strategies for resource-limited settings. Our primary contribution is a novel collaborative embeddings offloading mechanism for IoT intrusion detection named SEED (Semantic Embeddings for Efficient Detection). This system leverages a lightweight, fine-tuned BERT model, chosen for its proven contextual and semantic understanding of sequences, to generate rich network embeddings at the edge. A compact neural network deployed on the end-device then queries these embeddings to assess network flow normality. This architecture alleviates the computational burden of running a full transformer on the device while capitalizing on its analytical performance. Our optimized BERT model is reduced by approximately 90% from its original size, now representing approximately 41 MB, suitable for the Edge. The resulting compact neural network is a mere 137 KB, appropriate for the IoT devices. This system achieves 99.9% detection accuracy with an average inference time of under 70 ms on a standard CPU. Finally, the paper discusses the ethical implications of LLM-IoT integration and evaluates the resilience of LLMs in dynamic and adversarial environments.
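The division of labor described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `edge_embed` is a stand-in random projection, not the fine-tuned BERT model; `DeviceHead` is a hypothetical two-layer network with untrained weights; and the embedding width, hidden size, and feature count are arbitrary. The point is only the split: the edge turns a flow into an embedding, and the device scores that embedding.

```python
import math
import random

EMBED_DIM = 64   # assumed embedding width; the abstract does not give one
HIDDEN = 32      # assumed hidden size of the on-device network
N_FEATURES = 10  # assumed number of numeric features per network flow


def edge_embed(flow_features, projection):
    """Edge side: stand-in encoder (projection + tanh), NOT the paper's BERT model."""
    return [
        math.tanh(sum(x * w for x, w in zip(flow_features, column)))
        for column in projection
    ]


class DeviceHead:
    """Device side: tiny two-layer network that only ever sees embeddings."""

    def __init__(self, embed_dim, hidden, rng):
        self.w1 = [[rng.gauss(0.0, 0.1) for _ in range(embed_dim)] for _ in range(hidden)]
        self.b1 = [0.0] * hidden
        self.w2 = [rng.gauss(0.0, 0.1) for _ in range(hidden)]
        self.b2 = 0.0

    def param_count(self):
        """Total number of weights and biases held on the device."""
        return sum(len(row) for row in self.w1) + len(self.b1) + len(self.w2) + 1

    def anomaly_score(self, embedding):
        """ReLU hidden layer followed by a sigmoid output in (0, 1)."""
        hidden = [
            max(0.0, sum(e * w for e, w in zip(embedding, row)) + b)
            for row, b in zip(self.w1, self.b1)
        ]
        logit = sum(h * w for h, w in zip(hidden, self.w2)) + self.b2
        return 1.0 / (1.0 + math.exp(-logit))


rng = random.Random(0)
projection = [[rng.gauss(0.0, 1.0) for _ in range(N_FEATURES)] for _ in range(EMBED_DIM)]
flow = [rng.gauss(0.0, 1.0) for _ in range(N_FEATURES)]

embedding = edge_embed(flow, projection)   # computed at the edge
head = DeviceHead(EMBED_DIM, HIDDEN, rng)  # lives on the IoT device
score = head.anomaly_score(embedding)      # device-side decision input

print(len(embedding), head.param_count(), 0.0 < score < 1.0)
```

With these illustrative sizes the device holds 2,113 parameters, about 8 KB as 32-bit floats, so a head of this shape fits comfortably within the 137 KB reported for the on-device network. The weights here are random, so the score carries no meaning until the head is trained.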

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12845598/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12845598/full.md

## References

40 references — full list in the complete paper: https://tomesphere.com/paper/PMC12845598/full.md

---
Source: https://tomesphere.com/paper/PMC12845598