Know When to Trust the Skill: Delayed Appraisal and Epistemic Vigilance for Single-Agent LLMs

Eren Unlu

arXiv:2604.16753 · cs.AI · April 21, 2026

TL;DR

This paper introduces MESA-S, a preliminary framework for single-agent LLMs that separates self-confidence from source-confidence and delays appraisal of retrieved skills, with the aim of supporting epistemic vigilance.

Contribution

It formalizes a metacognitive architecture that separates self-confidence from source-confidence and incorporates delayed evaluation to improve reliability.

Findings

01. Explicit trust provenance reduces vulnerabilities.

02. Delayed escalation prunes unnecessary reasoning.

03. Decoupling confidence prevents inflation and improves trustworthiness.

Abstract

As large language models (LLMs) transition into autonomous agents integrated with extensive tool ecosystems, traditional routing heuristics increasingly succumb to context pollution and "overthinking". We argue that the bottleneck is not a deficit in algorithmic capability or skill diversity, but the absence of disciplined second-order metacognitive governance. In this paper, our scientific contribution focuses on the computational translation of human cognitive control - specifically, delayed appraisal, epistemic vigilance, and region-of-proximal offloading - into a single-agent architecture. We introduce MESA-S (Metacognitive Skills for Agents, Single-agent), a preliminary framework that shifts scalar confidence estimation into a vector separating self-confidence (parametric certainty) from source-confidence (trust in retrieved external procedures). By formalizing a delayed procedural…
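The move from a scalar confidence to a two-component vector can be illustrated with a minimal sketch. Everything named here (ConfidenceVector, route, the thresholds tau_self and tau_source, and the three action labels) is hypothetical and chosen only for illustration; the abstract does not specify MESA-S's actual decision rule, so this should not be read as the paper's method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfidenceVector:
    """Hypothetical two-component confidence replacing a single scalar."""
    self_conf: float    # parametric certainty in the model's own answer
    source_conf: float  # trust in a retrieved external procedure ("skill")


def route(conf: ConfidenceVector,
          tau_self: float = 0.8,
          tau_source: float = 0.7) -> str:
    """Illustrative routing rule (an assumption, not the paper's policy).

    The two components are checked separately, so high trust in a source
    cannot inflate the model's own certainty, and vice versa.
    """
    if conf.self_conf >= tau_self:
        return "answer_directly"   # the model is certain on its own
    if conf.source_conf >= tau_source:
        return "use_skill"         # defer to a trusted external procedure
    return "escalate"              # neither is trusted: appraise further


if __name__ == "__main__":
    for cv in (ConfidenceVector(0.9, 0.1),
               ConfidenceVector(0.3, 0.9),
               ConfidenceVector(0.3, 0.2)):
        print(cv, route(cv))
```

Under these assumed thresholds, the pairs (0.3, 0.9) and (0.9, 0.3) lead to different actions even though a single averaged score of 0.6 would treat them identically, which is the kind of distinction a confidence vector is meant to preserve.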
