"A Good Bot Always Knows Its Limitations": Assessing Autonomous System Decision-making Competencies through Factorized Machine Self-confidence

Brett W. Israelsen; Nisar R. Ahmed; Matthew Aitken; Eric W. Frew; Dale A. Lawrence; Brian M. Argrow

arXiv:2407.19631·cs.AI·April 16, 2025

TL;DR

This paper introduces the FaMSeC framework, a comprehensive method for autonomous systems to assess their decision-making competency using self-confidence indicators derived from probabilistic and statistical analysis.

Contribution

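The paper presents the FaMSeC framework, which combines several factors driving competency (outcome assessment, solver quality, model quality, alignment quality, and past experience) into self-confidence indicators computed through meta-reasoning over problem-solving statistics.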

Findings

01. FaMSeC indicators effectively assess system competency.

02. Outcome and solver quality factors can be derived for various tasks.

03. Numerical evaluations confirm the indicators' performance.

Abstract

How can intelligent machines assess their competency to complete a task? This question has come into focus for autonomous systems that algorithmically make decisions under uncertainty. We argue that machine self-confidence -- a form of meta-reasoning based on self-assessments of system knowledge about the state of the world, itself, and ability to reason about and execute tasks -- leads to many computable and useful competency indicators for such agents. This paper presents our body of work, so far, on this concept in the form of the Factorized Machine Self-confidence (FaMSeC) framework, which holistically considers several major factors driving competency in algorithmic decision-making: outcome assessment, solver quality, model quality, alignment quality, and past experience. In FaMSeC, self-confidence indicators are derived via 'problem-solving statistics' embedded in Markov decision…
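
To make the idea of a self-confidence indicator built from problem-solving statistics more concrete, here is a minimal Python sketch. It is not taken from the paper: the toy chain-world task, the function names `simulate_return` and `outcome_indicator`, the reward values, and the scoring formula are all hypothetical choices made for illustration. The sketch samples returns for a fixed policy on a small Markov decision process, then compares how far those returns sit above or below a user-chosen threshold to produce a score in [-1, 1].

```python
import random


def simulate_return(rng, n_steps=20, p_success=0.8, goal=5):
    """Roll out a fixed 'always move right' policy on a toy 1-D chain MDP.

    Hypothetical task: each step costs 1, and reaching the goal cell pays
    +20 and ends the episode.
    """
    position, total = 0, 0.0
    for _ in range(n_steps):
        # The intended move succeeds with probability p_success;
        # otherwise the agent slips one cell back (never below cell 0).
        position += 1 if rng.random() < p_success else -1
        position = max(position, 0)
        total -= 1.0
        if position >= goal:
            total += 20.0
            break
    return total


def outcome_indicator(returns, threshold):
    """Map sampled returns to a score in [-1, 1] (illustrative formula).

    Compares the average surplus above the threshold with the average
    shortfall below it: +1 means every sample beats the threshold,
    -1 means none does.
    """
    upside = sum(max(r - threshold, 0.0) for r in returns) / len(returns)
    downside = sum(max(threshold - r, 0.0) for r in returns) / len(returns)
    if upside == 0.0 and downside == 0.0:
        return 0.0  # every sample sits exactly on the threshold
    return (upside - downside) / (upside + downside)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    returns = [simulate_return(rng) for _ in range(1000)]
    print(outcome_indicator(returns, threshold=0.0))
```

A score near +1 would suggest the agent expects to clear the chosen performance threshold comfortably, while a score near -1 would suggest the task is likely beyond its competency under the assumed model. The threshold stands in for whatever minimum acceptable outcome a user might specify.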

Code & Models

Repositories

COHRINT/FaMSeC
mxnet · Official

Taxonomy

Topics: Ethics and Social Impacts of AI · Safety Systems Engineering in Autonomy · Software Reliability and Analysis Research

Methods: Focus