ILION: Deterministic Pre-Execution Safety Gates for Agentic AI Systems

Florin Adrian Chitan

arXiv:2603.13247·cs.AI·March 17, 2026

Open Access

TL;DR

ILION is a deterministic, interpretable safety gate for autonomous AI agents that classifies proposed actions as safe or unsafe in real time, outperforming existing moderation tools in accuracy and speed without requiring labeled data.

Contribution

The paper introduces ILION, a deterministic safety system for agentic AI that makes rapid, interpretable decisions without training data, addressing a safety gap that content-moderation tools leave open.

Findings

01

ILION achieves an F1 score of 0.8515 with a false positive rate of 7.9%.

02

ILION operates with sub-millisecond latency, significantly faster than baselines.

03

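Existing text moderation tools fail on agent safety tasks because of a task mismatch: they evaluate linguistic content for harm, not whether a proposed action falls within an agent's authorized scope.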
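
For readers unfamiliar with the two metrics in finding 01, the sketch below shows how F1 and false positive rate are conventionally computed from a confusion matrix. It assumes "unsafe" is the positive class, which this summary does not state, and the counts are hypothetical placeholders, not figures from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    # Assumption: "unsafe" is treated as the positive class.
    tp: int  # unsafe actions correctly blocked
    fp: int  # safe actions wrongly blocked
    tn: int  # safe actions correctly allowed
    fn: int  # unsafe actions wrongly allowed


def f1_score(c: ConfusionCounts) -> float:
    """F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    denom = 2 * c.tp + c.fp + c.fn
    return 2 * c.tp / denom if denom else 0.0


def false_positive_rate(c: ConfusionCounts) -> float:
    """FPR = FP / (FP + TN), the share of safe actions that get blocked."""
    denom = c.fp + c.tn
    return c.fp / denom if denom else 0.0


# Hypothetical counts for illustration only; NOT taken from the paper.
counts = ConfusionCounts(tp=80, fp=10, tn=90, fn=20)
f1 = f1_score(counts)
fpr = false_positive_rate(counts)
```

A low false positive rate matters for an execution gate in particular, since every false positive is a legitimate agent action that gets refused.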

Abstract

The proliferation of autonomous AI agents capable of executing real-world actions - filesystem operations, API calls, database modifications, financial transactions - introduces a class of safety risk not addressed by existing content-moderation infrastructure. Current text-safety systems evaluate linguistic content for harm categories such as violence, hate speech, and sexual content; they are architecturally unsuitable for evaluating whether a proposed action falls within an agent's authorized operational scope. We present ILION (Intelligent Logic Identity Operations Network), a deterministic execution gate for agentic AI systems. ILION employs a five-component cascade architecture - Transient Identity Imprint (TII), Semantic Vector Reference Frame (SVRF), Identity Drift Control (IDC), Identity Resonance Score (IRS) and Consensus Veto Layer (CVL) - to classify proposed agent actions…
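
To make the idea of a deterministic pre-execution gate concrete, here is a minimal sketch of a rule cascade in which each stage may veto a proposed action before it runs. It is not ILION's architecture: the summary above does not describe how TII, SVRF, IDC, IRS, or CVL work internally, so the two stages shown (a tool allowlist and a path-scope check) and all names (`Action`, `Verdict`, `gate`, `make_tool_allowlist`, `make_path_scope`) are hypothetical stand-ins chosen for illustration.

```python
from __future__ import annotations

import posixpath
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Action:
    tool: str    # hypothetical tool name, e.g. "fs.write"
    target: str  # hypothetical target, e.g. a file path


@dataclass(frozen=True)
class Verdict:
    allowed: bool
    stage: str   # which stage decided
    reason: str  # human-readable explanation


# A stage returns a reason string to block, or None to pass the action on.
Stage = Callable[[Action], Optional[str]]


def make_tool_allowlist(allowed: frozenset[str]) -> Stage:
    def check(action: Action) -> Optional[str]:
        if action.tool not in allowed:
            return f"tool '{action.tool}' is outside the authorized set"
        return None
    return check


def make_path_scope(root: str) -> Stage:
    prefix = root.rstrip("/") + "/"

    def check(action: Action) -> Optional[str]:
        if not action.tool.startswith("fs."):
            return None  # this stage only inspects filesystem actions
        # Normalize first so "../" segments cannot escape the root.
        path = posixpath.normpath(action.target)
        if path != root.rstrip("/") and not path.startswith(prefix):
            return f"target '{path}' is outside '{root}'"
        return None
    return check


def gate(action: Action, stages: list[tuple[str, Stage]]) -> Verdict:
    """Run stages in order; the first stage that objects blocks the action."""
    for name, stage in stages:
        reason = stage(action)
        if reason is not None:
            return Verdict(False, name, reason)
    return Verdict(True, "all", "no stage objected")


stages = [
    ("tool_allowlist", make_tool_allowlist(frozenset({"fs.read", "fs.write"}))),
    ("path_scope", make_path_scope("/workspace")),
]
ok = gate(Action("fs.write", "/workspace/notes.txt"), stages)
escaped = gate(Action("fs.write", "/workspace/../etc/passwd"), stages)
unknown = gate(Action("db.drop", "users"), stages)
```

Because every stage is a pure function of the proposed action, the same input always yields the same verdict, and the `stage` and `reason` fields record why an action was blocked. Those two properties are what "deterministic" and "interpretable" mean in this sketch; how ILION achieves them is described in the paper itself.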

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Topic Modeling · Hate Speech and Cyberbullying Detection