Measuring What AI Systems Might Do: Towards A Measurement Science in AI

Konstantinos Voudouris; Mirko Thalmann; Alex Kipnis; José Hernández-Orallo; Eric Schulz

arXiv:2603.00063 · cs.CY · March 3, 2026

Open Access

TL;DR

This paper advocates for a scientific measurement approach to AI capabilities and propensities, emphasizing the importance of causal, dispositional properties over traditional performance metrics.

Contribution

It introduces a principled framework based on philosophy of science and measurement theory to better evaluate AI systems' dispositional properties.

Findings

01 Current evaluation methods conflate performance with dispositions.

02 A new framework for measuring AI dispositions is proposed.

03 Dominant evaluation practices often bypass causal measurement steps.

Abstract

Scientists, policy-makers, business leaders, and members of the public care about what modern artificial intelligence systems are disposed to do. Yet terms such as capabilities, propensities, skills, values, and abilities are routinely used interchangeably and conflated with observable performance, with AI evaluation practices rarely specifying what quantity they purport to measure. We argue that capabilities and propensities are dispositional properties - stable features of systems characterised by counterfactual relationships between contextual conditions and behavioural outputs. Measuring a disposition requires (i) hypothesising which contextual properties are causally relevant, (ii) independently operationalising and measuring those properties, and (iii) empirically mapping how variation in those properties affects the probability of the behaviour. Dominant approaches to AI…
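The three steps listed in the abstract can be illustrated with a toy example. The sketch below is not the authors' method; it assumes a single hypothetical contextual property ("difficulty" on a 0 to 10 scale), simulated pass/fail outcomes, and a plain logistic curve as the mapping from that property to the probability of success. All names and numbers are illustrative assumptions.

```python
import math
import random


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, steps=5000):
    """Fit P(y=1 | x) = sigmoid(w0 + w1 * x) by gradient ascent on the mean log-likelihood."""
    w0, w1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = y - sigmoid(w0 + w1 * x)
            g0 += err
            g1 += err * x
        w0 += lr * g0 / n
        w1 += lr * g1 / n
    return w0, w1


# Step (i): hypothesise that task difficulty is causally relevant (an assumption).
# Step (ii): assume difficulty was measured independently on a 0..10 scale.
SCALE_MIDPOINT = 5.0
rng = random.Random(0)  # fixed seed so the simulation is reproducible
difficulties = [rng.uniform(0.0, 10.0) for _ in range(400)]

# Simulated system: succeeds with probability sigmoid(1.2 * (5 - difficulty)).
# These "true" values are invented purely to generate toy data.
outcomes = [
    1 if rng.random() < sigmoid(1.2 * (5.0 - d)) else 0
    for d in difficulties
]

# Step (iii): empirically map difficulty to probability of success.
# Difficulty is centred on the scale midpoint to keep the fit well conditioned.
w0, w1 = fit_logistic([d - SCALE_MIDPOINT for d in difficulties], outcomes)


def p_success(difficulty: float) -> float:
    """Estimated probability of success at a given difficulty."""
    return sigmoid(w0 + w1 * (difficulty - SCALE_MIDPOINT))


# Difficulty at which the estimated success probability is 0.5.
threshold = SCALE_MIDPOINT - w0 / w1

print(f"slope={w1:.2f}, 50% threshold={threshold:.2f}")
print(f"P(success | difficulty=1)={p_success(1.0):.2f}")
print(f"P(success | difficulty=9)={p_success(9.0):.2f}")
```

In this toy setting the quantity of interest is the fitted curve (for example its 50% threshold), not the average pass rate, which would change if the same system were tested on an easier or harder mix of items.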

Taxonomy

Topics: Ethics and Social Impacts of AI · Explainable Artificial Intelligence (XAI) · Innovation, Sustainability, Human-Machine Systems