Parallax: Why AI Agents That Think Must Never Act

Joel Fokou

arXiv:2604.12986·cs.CR·April 15, 2026

TL;DR

Parallax introduces an architectural framework for safe autonomous AI execution that prevents harmful actions by structurally separating reasoning from execution and incorporating multi-tiered validation and rollback mechanisms.

Contribution

The paper proposes an architectural paradigm for AI agent safety that goes beyond prompt-based guardrails, and it is accompanied by an open-source implementation and an adversarial evaluation.

Findings

01 Blocks 98.9% of attacks in tests with default settings

02 Achieves 100% attack blocking under the maximum-security configuration

03 The architectural boundary remains effective even when the reasoning system is compromised

Abstract

Autonomous AI agents are rapidly transitioning from experimental tools to operational infrastructure, with projections that 80% of enterprise applications will embed AI copilots by the end of 2026. As agents gain the ability to execute real-world actions (reading files, running commands, making network requests, modifying databases), a fundamental security gap has emerged. The dominant approach to agent safety relies on prompt-level guardrails: natural language instructions that operate at the same abstraction level as the threats they attempt to mitigate. This paper argues that prompt-based safety is architecturally insufficient for agents with execution capability and introduces Parallax, a paradigm for safe autonomous AI execution grounded in four principles: Cognitive-Executive Separation, which structurally prevents the reasoning system from executing actions; Adversarial…
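
To make the idea of structurally separating reasoning from execution concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's implementation: the names `Proposal`, `Executor`, `tier_allowlist`, `tier_scope`, and `compromised_reasoner` are hypothetical, and the in-memory `store` stands in for real side effects such as files or databases. The reasoner can only emit inert data; the executor alone changes state, runs each proposal through every validation tier, and records an undo step so accepted actions can be rolled back.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Proposal:
    """Inert description of an action; the reasoner can only emit these."""
    action: str
    target: str
    value: str = ""


# Hypothetical validation tiers: each returns True if the proposal may proceed.
def tier_allowlist(p: Proposal) -> bool:
    return p.action in {"write", "delete"}


def tier_scope(p: Proposal) -> bool:
    return p.target.startswith("scratch/")


class Executor:
    """The only component that holds state-changing capability."""

    def __init__(self, tiers: List[Callable[[Proposal], bool]]):
        self._tiers = list(tiers)
        self.store: Dict[str, str] = {}  # stand-in for real side effects
        self._undo: List[Callable[[], None]] = []

    def submit(self, p: Proposal) -> bool:
        # Every tier must approve; a single rejection blocks the action.
        if not all(tier(p) for tier in self._tiers):
            return False
        had, old = p.target in self.store, self.store.get(p.target)

        def undo() -> None:
            # Restore the value that existed before this action ran.
            if had:
                self.store[p.target] = old
            else:
                self.store.pop(p.target, None)

        self._undo.append(undo)
        if p.action == "write":
            self.store[p.target] = p.value
        else:  # "delete"
            self.store.pop(p.target, None)
        return True

    def rollback(self) -> None:
        # Undo accepted actions in reverse order.
        while self._undo:
            self._undo.pop()()


def compromised_reasoner() -> List[Proposal]:
    # Even a fully compromised reasoner can only return proposals.
    return [
        Proposal("write", "scratch/notes.txt", "draft"),
        Proposal("write", "etc/passwd", "pwned"),
        Proposal("exec", "scratch/run.sh"),
    ]


executor = Executor([tier_allowlist, tier_scope])
results = [executor.submit(p) for p in compromised_reasoner()]
```

In this sketch the reasoner never receives a reference to the executor, so a prompt-level compromise changes which proposals are emitted but not which ones pass validation. Calling `executor.rollback()` afterwards restores `store` to its state before the accepted write.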
