Audited calibration under regime shift as a computational test of support-structured broadcast
Mark Walsh

TL;DR
This paper tests whether metacognitive calibration depends on support structure in a two-channel probabilistic cue-integration task with regime shifts. It reports that an auditor architecture improves calibration and control behavior relative to content-dominated models.
Contribution
It introduces a computational test that distinguishes support-structured calibration from content-dominated models, and demonstrates the benefits of an auditor architecture under regime shifts.
Findings
The auditor architecture improves calibration in degraded regimes.
The auditor selectively requests additional evidence under low-support conditions.
System-level confidence can dissociate from content performance.
Abstract
A central prediction of the accompanying theoretical framework is that metacognitive calibration can vary even when content-level performance is held approximately fixed, depending on whether support structure is preserved in a globally reusable broadcast state. We provide a minimal computational test of this claim using a two-channel probabilistic cue-integration task with regime shifts that induce systematic miscalibration in one channel. We compare content-dominated architectures, in which confidence is calibrated by a single global mapping from evidence strength to probability, to an auditor architecture that learns a regime-conditioned calibration mapping from an audit trail of outcomes. We then couple confidence to control by implementing a policy that either acts immediately or requests one additional sample when confidence falls below a threshold. Across matched evidence…
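As a rough illustration of the comparison the abstract describes, the following Python sketch contrasts a single global calibration mapping with a regime-conditioned mapping learned from an audit trail, and couples confidence to an act-or-request policy. It is not the paper's implementation. The Gaussian cue model, the binning of evidence strength, the Laplace smoothing, the 0.75 threshold, the 500-trial regime blocks, the assumption that the auditor observes the regime label, and all names (`BinnedCalibrator`, `strength_bin`, `choose_action`, `sample_trial`, `run`) are hypothetical choices made for illustration.

```python
import random
from collections import defaultdict


class BinnedCalibrator:
    """Maps a key to the empirical probability of being correct (assumed design)."""

    def __init__(self):
        self.hits = defaultdict(int)
        self.total = defaultdict(int)

    def update(self, key, correct):
        # Append one outcome to the audit trail for this key.
        self.hits[key] += int(correct)
        self.total[key] += 1

    def confidence(self, key):
        # Laplace-smoothed accuracy; an unseen key yields 0.5.
        return (self.hits[key] + 1) / (self.total[key] + 2)


def strength_bin(evidence, width=0.5, max_bin=5):
    # Discretize |evidence| into a small number of strength bins.
    return min(int(abs(evidence) / width), max_bin)


def choose_action(confidence, threshold=0.75):
    # Act immediately, or request one additional sample when confidence is low.
    return "act" if confidence >= threshold else "request_sample"


def sample_trial(rng, regime):
    state = rng.choice((-1, 1))
    cue_a = state + rng.gauss(0, 1)
    # Assumption: in the "shifted" regime channel B carries no signal,
    # yet it still contributes to the summed evidence strength.
    cue_b = (state if regime == "stable" else 0) + rng.gauss(0, 1)
    evidence = cue_a + cue_b
    choice = 1 if evidence >= 0 else -1
    return strength_bin(evidence), choice == state


def run(n_trials=4000, seed=0, block=500):
    rng = random.Random(seed)
    global_cal, auditor = BinnedCalibrator(), BinnedCalibrator()
    brier = {"global": 0.0, "auditor": 0.0}
    requests = {"stable": 0, "shifted": 0}
    for t in range(n_trials):
        regime = "stable" if (t // block) % 2 == 0 else "shifted"
        b, correct = sample_trial(rng, regime)
        # Predict before updating, so each score is out-of-sample.
        p_global = global_cal.confidence(b)
        p_audit = auditor.confidence((regime, b))
        brier["global"] += (p_global - correct) ** 2
        brier["auditor"] += (p_audit - correct) ** 2
        if choose_action(p_audit) == "request_sample":
            requests[regime] += 1  # only counted here, not resampled
        global_cal.update(b, correct)
        auditor.update((regime, b), correct)
    return {k: v / n_trials for k, v in brier.items()}, requests
```

Calling `run()` returns the mean Brier score of each calibrator and the number of sample requests issued per regime. Content-level choices are identical for both calibrators in this sketch, so any difference in the scores comes only from how confidence is mapped, which is the dissociation the abstract refers to.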
Taxonomy
Topics: Neural and Behavioral Psychology Studies · Auditing, Earnings Management, Governance · Image and Video Quality Assessment
