The Cognitive Circuit Breaker: A Systems Engineering Framework for Intrinsic AI Reliability

Jonathan Pan

arXiv:2604.13417 · cs.SE · April 16, 2026

TL;DR

The paper introduces the Cognitive Circuit Breaker, a systems engineering framework for intrinsic AI reliability that detects hallucinations in LLMs with minimal latency by analyzing internal model states.

Contribution

The paper proposes an intrinsic reliability monitoring method that uses a model's internal states to detect hallucinations, reducing reliance on external post-generation checks.

Findings

01

Cognitive dissonance detected in the model's internal states correlates significantly with hallucinations.

02

The framework generalizes across model architectures and to out-of-distribution (OOD) data.

03

The framework adds negligible computational overhead to the inference pipeline.

Abstract

As Large Language Models (LLMs) are increasingly deployed in mission-critical software systems, detecting hallucinations and "faked truthfulness" has become a paramount engineering challenge. Current reliability architectures rely heavily on post-generation, black-box mechanisms, such as Retrieval-Augmented Generation (RAG) cross-checking or LLM-as-a-judge evaluators. These extrinsic methods introduce unacceptable latency, high computational overhead, and reliance on secondary external API calls, frequently violating standard software engineering Service Level Agreements (SLAs). In this paper, we propose the Cognitive Circuit Breaker, a novel systems engineering framework that provides intrinsic reliability monitoring with minimal latency overhead. By extracting hidden states during a model's forward pass, we calculate the "Cognitive Dissonance Delta", the mathematical gap between…
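
The excerpt above does not give the full definition of the Cognitive Dissonance Delta, so the following is only a minimal sketch of the general circuit-breaker pattern the abstract describes: a score is computed from each step's hidden state, and generation halts once that score crosses a threshold. Every name here (`run_with_breaker`, `toy_score`, `threshold`, `patience`) is hypothetical, and `toy_score` is a stand-in placeholder, not the paper's metric.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Sequence

Vector = Sequence[float]


@dataclass
class BreakerResult:
    tripped: bool          # True if the breaker halted generation
    tokens_emitted: int    # steps allowed through before halting
    scores: List[float]    # per-step scores that were computed


def run_with_breaker(
    hidden_states: Iterable[Vector],
    score_fn: Callable[[Vector], float],
    threshold: float,
    patience: int = 1,
) -> BreakerResult:
    """Halt once score_fn exceeds threshold for `patience` consecutive steps.

    Assumption: one hidden-state vector is available per generation step.
    """
    scores: List[float] = []
    consecutive = 0
    emitted = 0
    for state in hidden_states:
        score = score_fn(state)
        scores.append(score)
        # Count consecutive over-threshold steps; reset on any step below it.
        consecutive = consecutive + 1 if score > threshold else 0
        if consecutive >= patience:
            return BreakerResult(True, emitted, scores)
        emitted += 1
    return BreakerResult(False, emitted, scores)


def toy_score(state: Vector) -> float:
    """Placeholder score (mean absolute activation).

    This is NOT the Cognitive Dissonance Delta; it only makes the sketch runnable.
    """
    return sum(abs(x) for x in state) / len(state)


# Hypothetical hidden states for four generation steps.
states = [[0.1, -0.1], [0.2, 0.2], [2.0, -2.0], [0.1, 0.1]]
result = run_with_breaker(states, toy_score, threshold=1.0)
print(result.tripped, result.tokens_emitted)
```

Because the check runs inside the generation loop on values the forward pass already produces, no second model call or external lookup is needed, which is the property the abstract contrasts with RAG cross-checking and LLM-as-a-judge evaluators. The `patience` parameter is an illustrative design choice for tolerating a single noisy step; the excerpt does not say whether the paper uses anything similar.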
