Invisible failures in human-AI interactions

Christopher Potts; Moritz Sudhof

arXiv:2603.15423·cs.CL·May 13, 2026

TL;DR

This paper analyzes silent AI failures in 100K human-AI interactions from the WildChat dataset, groups them into eight archetypes, and shows that the archetypes stay relevant across different AI capabilities and domains.

Contribution

The paper introduces a taxonomy of eight invisible-failure archetypes, analyzes how they co-occur, and provides a dataset and code for failure monitoring in AI systems.

Findings

01

79% of AI failures are invisible: the user gives no overt indication that something went wrong

02

Failure rates drop substantially with present-day frontier LMs, but invisibility persists

03

Failure archetypes remain stable across AI capabilities

Abstract

AI systems fail silently far more often than they fail visibly. In an analysis of 100K human-AI interactions from the WildChat dataset, we find that 79% of AI failures are invisible: something went wrong but the user gave no overt indication that there was a problem. These invisible failures cluster into eight archetypes that help us characterize where and how AI systems are failing to meet users' needs. In addition, the archetypes show systematic co-occurrence patterns indicating higher-level failure types. To address the question of whether these archetypes will remain relevant as AI systems become more capable, we also created and annotated a counterfactual dataset in which WildChat's 2024-era responses are replaced by those from three present-day frontier LMs. This analysis indicates that failure rates have dropped substantially, but that the vast majority of failures remain…
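The two quantities the abstract describes, the share of failures that are invisible and the co-occurrence of archetypes, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the record fields (`failed`, `user_signaled`, `archetypes`), the placeholder labels "A", "B", "C", and the helper names are hypothetical and are not taken from the paper's released dataset or code.

```python
from collections import Counter
from itertools import combinations

# Hypothetical annotation records. Field names and archetype labels are
# illustrative assumptions, not the schema of the paper's dataset.
annotations = [
    {"failed": True, "user_signaled": False, "archetypes": ["A", "B"]},
    {"failed": True, "user_signaled": True, "archetypes": ["A"]},
    {"failed": False, "user_signaled": False, "archetypes": []},
    {"failed": True, "user_signaled": False, "archetypes": ["A", "B", "C"]},
]


def invisible_share(records):
    """Fraction of failed interactions where the user gave no overt signal."""
    failures = [r for r in records if r["failed"]]
    if not failures:
        return 0.0
    invisible = [r for r in failures if not r["user_signaled"]]
    return len(invisible) / len(failures)


def archetype_cooccurrence(records):
    """Count unordered archetype pairs that appear together in invisible failures."""
    counts = Counter()
    for r in records:
        if r["failed"] and not r["user_signaled"]:
            # Sort and deduplicate so each pair has one canonical key.
            for pair in combinations(sorted(set(r["archetypes"])), 2):
                counts[pair] += 1
    return counts


share = invisible_share(annotations)
pairs = archetype_cooccurrence(annotations)
```

On real annotations, `share` would correspond to the kind of figure reported in the abstract (79%), and the most frequent keys in `pairs` would point to the archetype combinations that suggest higher-level failure types.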

Code & Models

Repositories

bigspinai/bigspin-invisible-failure-archetypes
github