Taming Silent Failures: A Framework for Verifiable AI Reliability

Guan-Yan Yang; Farn Wang

arXiv:2510.22224 · cs.SE · March 3, 2026

TL;DR

This paper presents FAME, a framework that combines offline formal synthesis with online runtime monitoring to detect silent AI failures in safety-critical systems. In an autonomous vehicle perception system, it detected 93.5% of critical safety violations that were otherwise silent.

Contribution

Introduces FAME (Formal Assurance and Monitoring Environment), a framework that integrates offline formal synthesis with online runtime monitoring to build a verifiable safety net around opaque AI components in safety-critical applications.

Findings

01

FAME detected 93.5% of otherwise-silent critical safety violations in an autonomous vehicle perception system.

02

The framework is contextualized within the ISO 26262 and ISO/PAS 8800 standards to support practical deployment.

03

Offers reliability engineers a practical, certifiable pathway for deploying trustworthy AI in safety-critical systems.

Abstract

The integration of Artificial Intelligence (AI) into safety-critical systems introduces a new reliability paradigm: silent failures, where AI produces confident but incorrect outputs that can be dangerous. This paper introduces the Formal Assurance and Monitoring Environment (FAME), a novel framework that confronts this challenge. FAME synergizes the mathematical rigor of offline formal synthesis with the vigilance of online runtime monitoring to create a verifiable safety net around opaque AI components. We demonstrate its efficacy in an autonomous vehicle perception system, where FAME successfully detected 93.5% of critical safety violations that were otherwise silent. By contextualizing our framework within the ISO 26262 and ISO/PAS 8800 standards, we provide reliability engineers with a practical, certifiable pathway for deploying trustworthy AI. FAME represents a crucial shift from…
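
To make the idea of an online runtime monitor concrete, the following is a minimal sketch of one temporal-consistency check on a perception stream. It is not FAME's implementation: the `Frame` type, the `VanishingObstacleMonitor` class, the 10 m threshold, and the safety property itself (a nearby obstacle should not disappear from one frame to the next) are all hypothetical choices made for illustration. In the framework described above, such a monitor would be synthesized offline from a formal specification, whereas here it is written by hand.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Frame:
    """One perception output (hypothetical schema, not from the paper)."""
    index: int
    obstacle_distance_m: Optional[float]  # None means "no obstacle reported"


class VanishingObstacleMonitor:
    """Flags frames where a nearby obstacle disappears between two frames.

    Illustrative property: if frame t-1 reports an obstacle closer than
    `near_m`, frame t must also report an obstacle. A violation suggests the
    perception component silently dropped a detection.
    """

    def __init__(self, near_m: float = 10.0) -> None:
        self.near_m = near_m  # assumed threshold, chosen for the example
        self._prev: Optional[Frame] = None

    def step(self, frame: Frame) -> bool:
        """Consume one frame; return True if it violates the property."""
        prev = self._prev
        self._prev = frame
        if prev is None or prev.obstacle_distance_m is None:
            return False  # nothing was being tracked, so nothing can vanish
        was_near = prev.obstacle_distance_m < self.near_m
        return was_near and frame.obstacle_distance_m is None


def check_stream(frames: Iterable[Frame],
                 monitor: VanishingObstacleMonitor) -> List[int]:
    """Return the indices of all frames that violate the monitored property."""
    return [f.index for f in frames if monitor.step(f)]


if __name__ == "__main__":
    stream = [
        Frame(0, 30.0),
        Frame(1, 8.0),
        Frame(2, None),   # obstacle at 8 m vanished: violation
        Frame(3, None),
        Frame(4, 7.5),
    ]
    print(check_stream(stream, VanishingObstacleMonitor()))
```

The monitor keeps only the previous frame as state, so each check runs in constant time alongside the perception component. A real deployment would need properties derived from the system's safety requirements rather than this single hand-picked rule.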
