Architectural Obsolescence of Unhardened Agentic-AI Runtimes
Alfredo Metere

TL;DR
This paper argues that current unhardened agentic-AI runtimes are structurally unable to detect divergences between an action and its audit record, and proposes a hardened architecture that detects every divergence type in the reported tests.
Contribution
It identifies seven runtime structures needed to detect F1--F4 divergences, all absent from OpenClaw's source tree, and presents an architecture that includes them.
Findings
OpenClaw detects none of the four divergence types in tests (recall 0.000 on every cell of every confusion matrix).
Enclawed-oss achieves perfect detection (accuracy 1.0) on all divergence types.
Structural modifications improve detection in real-world plugin channels.
Abstract
An agentic-AI runtime issues tool calls, sends messages, and actuates devices on behalf of an LLM. Catching the four ways an action can diverge from its audit record -- F1 gate-bypass, F2 audit-forgery, F3 silent host failure, F4 wrong-target -- is a load-bearing safety property of any such runtime. We show that upstream OpenClaw, the most engineered single-user agentic-AI gateway in public release, catches none of them: recall is 0.000 on every cell of every confusion matrix, both on a 1600-sample template baseline run through OpenClaw's actual production command-line interface (CLI) and on a ten-LLM cross-model generalisation run. Detecting F1--F4 requires seven specific runtime structures absent from OpenClaw's source tree: a biconditional checker, a hash-chained audit log, an extension admission gate, a two-layer egress guard, a Bell-LaPadula classification policy, a module-signing trust root,…
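Two of the structures named in the abstract, the hash-chained audit log and the biconditional checker, can be illustrated with a minimal Python sketch. This is not the paper's implementation: the names `AuditLog`, `GENESIS`, and `divergences`, the record layout, and the reading of "biconditional" as "an action ran if and only if a matching record exists" are all assumptions made for illustration.

```python
import hashlib
import json

GENESIS = "0" * 64  # hypothetical starting value for the chain


def _digest(prev_hash: str, payload: dict) -> str:
    # Hash the previous record's hash together with a canonical JSON payload.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


class AuditLog:
    """Append-only log in which each record commits to its predecessor."""

    def __init__(self):
        self.records = []

    def append(self, payload: dict) -> dict:
        prev = self.records[-1]["hash"] if self.records else GENESIS
        record = {"payload": payload, "prev": prev, "hash": _digest(prev, payload)}
        self.records.append(record)
        return record

    def verify(self) -> bool:
        # Recompute every link; any edited or reordered record breaks the chain.
        prev = GENESIS
        for record in self.records:
            if record["prev"] != prev:
                return False
            if record["hash"] != _digest(prev, record["payload"]):
                return False
            prev = record["hash"]
        return True


def divergences(executed: set, audited: set) -> tuple:
    """Assumed biconditional check over (action_id, target) pairs.

    Returns (ran without a record, recorded without running).
    Both sets are empty exactly when executed == audited.
    """
    return executed - audited, audited - executed


if __name__ == "__main__":
    log = AuditLog()
    log.append({"id": "a1", "target": "alice"})
    log.append({"id": "a2", "target": "bob"})
    print(log.verify())

    audited = {(r["payload"]["id"], r["payload"]["target"]) for r in log.records}
    executed = {("a1", "alice"), ("a2", "mallory")}  # a2 went to a different target
    print(divergences(executed, audited))

    log.records[0]["payload"]["target"] = "mallory"  # tamper with history
    print(log.verify())
```

In this sketch, a mismatch in either direction of the set comparison flags a divergence, and editing any stored payload invalidates every later link in the chain. How the paper's runtime actually observes executed actions, and how it maps these checks onto F1--F4, is not specified in the text above.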
