Quantifying the Expectation-Realisation Gap for Agentic AI Systems
Sebastian Lobentanzer

TL;DR
This paper reviews controlled trials and independent validations to quantify the gap between expected and realised outcomes of agentic AI systems in software engineering, clinical documentation, and clinical decision support, finding consistent overestimation and a need for structured benefit planning.
Contribution
It provides empirical evidence of the expectation-realisation gap in agentic AI, emphasising the importance of explicit, quantified benefit expectations with human oversight considerations.
Findings
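In software engineering, experienced developers expected a 24% speedup from AI tools but were slowed by 19%, a 43 percentage-point calibration error.
In clinical documentation, measured time savings were less than one minute per note, against vendor claims of multi-minute savings; one widely deployed tool showed no statistically significant effect.
In clinical decision support, externally validated performance falls substantially below developer-reported metrics.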
Abstract
Agentic AI systems are deployed with expectations of substantial productivity gains, yet rigorous empirical evidence reveals systematic discrepancies between pre-deployment expectations and post-deployment outcomes. We review controlled trials and independent validations across software engineering, clinical documentation, and clinical decision support to quantify this expectation-realisation gap. In software development, experienced developers expected a 24% speedup from AI tools but were slowed by 19% -- a 43 percentage-point calibration error. In clinical documentation, vendor claims of multi-minute time savings contrast with measured reductions of less than one minute per note, and one widely deployed tool showed no statistically significant effect. In clinical decision support, externally validated performance falls substantially below developer-reported metrics. These shortfalls…
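The 43 percentage-point figure follows from treating the expected and observed effects as signed changes on the same scale: an expected +24% against an observed -19%. The sketch below is only an illustration of that arithmetic, not the paper's method; the function name `calibration_gap_pp` and the sign convention (positive means speedup, negative means slowdown) are assumptions introduced here.

```python
def calibration_gap_pp(expected_change_pct: float, observed_change_pct: float) -> float:
    """Return the signed gap, in percentage points, between an expected
    and an observed effect.

    Assumed convention (not from the paper): positive values are speedups,
    negative values are slowdowns. A positive gap means the effect was
    overestimated.
    """
    return expected_change_pct - observed_change_pct


# Figures quoted in the abstract for software development.
expected_speedup = 24.0   # developers expected a 24% speedup
observed_speedup = -19.0  # developers were slowed by 19%

gap = calibration_gap_pp(expected_speedup, observed_speedup)
print(f"{gap:.0f} percentage points")  # prints "43 percentage points"
```

The same subtraction applies to any pair of expected and measured effects expressed in the same unit, which is one reason to state benefit expectations explicitly and numerically before deployment.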
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education · Healthcare Technology and Patient Monitoring · Scientific Computing and Data Management
