Agentic Uncertainty Reveals Agentic Overconfidence

Jean Kaddour; Srijan Patel; Gbètondji Dovonon; Leo Richter; Pasquale Minervini; Matt J. Kusner

arXiv:2602.06948 · cs.AI · February 9, 2026

Open Access

TL;DR

This paper investigates agentic uncertainty in AI agents, revealing widespread overconfidence and showing that pre-execution assessments with less information can sometimes outperform post-execution reviews in predicting success.

Contribution

The paper introduces the concept of agentic uncertainty, demonstrates pervasive overconfidence in AI agents, and finds that adversarial prompting, which reframes self-assessment as bug-finding, achieves the best calibration among the assessment methods studied.

Findings

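01. Agents are overconfident: their predicted success rates are much higher than their actual success rates. Some agents that succeed only 22% of the time predict 77% success.

02. Pre-execution assessment, made with strictly less information, tends to discriminate between successes and failures better than standard post-execution review, though the differences are not always significant.

03. Adversarial prompting that reframes assessment as bug-finding achieves the best calibration.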

Abstract

Can AI agents predict whether they will succeed at a task? We study agentic uncertainty by eliciting success probability estimates before, during, and after task execution. All results exhibit agentic overconfidence: some agents that succeed only 22% of the time predict 77% success. Counterintuitively, pre-execution assessment with strictly less information tends to yield better discrimination than standard post-execution review, though differences are not always significant. Adversarial prompting reframing assessment as bug-finding achieves the best calibration.
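The abstract distinguishes overconfidence and calibration (do stated probabilities match observed success rates?) from discrimination (do successful tasks receive higher estimates than failed ones?). The sketch below shows one common way to quantify each, assuming the mean confidence gap, the Brier score, and AUROC as metrics. These choices, the function names, and the toy data are illustrative assumptions; this page does not state which metrics the paper itself reports.

```python
from typing import Sequence


def overconfidence_gap(predicted: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean predicted success probability minus observed success rate.

    Positive values indicate overconfidence.
    """
    mean_predicted = sum(predicted) / len(predicted)
    success_rate = sum(outcomes) / len(outcomes)
    return mean_predicted - success_rate


def brier_score(predicted: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared error between predicted probability and 0/1 outcome.

    Lower is better; it penalizes miscalibrated estimates.
    """
    return sum((p - y) ** 2 for p, y in zip(predicted, outcomes)) / len(predicted)


def auroc(predicted: Sequence[float], outcomes: Sequence[int]) -> float:
    """Probability that a random success gets a higher estimate than a random failure.

    Ties count as half. 0.5 means no discrimination, 1.0 means perfect ranking.
    """
    successes = [p for p, y in zip(predicted, outcomes) if y == 1]
    failures = [p for p, y in zip(predicted, outcomes) if y == 0]
    if not successes or not failures:
        raise ValueError("AUROC needs at least one success and one failure")
    wins = sum(
        1.0 if s > f else 0.5 if s == f else 0.0
        for s in successes
        for f in failures
    )
    return wins / (len(successes) * len(failures))


# Hypothetical toy data (not from the paper): five tasks, one success.
predicted = [0.9, 0.8, 0.7, 0.8, 0.6]  # agent's stated success probabilities
outcomes = [1, 0, 0, 0, 0]             # 1 = task succeeded, 0 = task failed

print("overconfidence gap:", overconfidence_gap(predicted, outcomes))
print("Brier score:", brier_score(predicted, outcomes))
print("AUROC:", auroc(predicted, outcomes))
```

In this toy data the agent ranks its single success above every failure, so discrimination is perfect, yet its average confidence far exceeds its success rate. This illustrates why the two properties have to be measured separately: an assessment method can improve one without improving the other.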

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Ethics and Social Impacts of AI