Capabilities Ain't All You Need: Measuring Propensities in AI
Daniel Romero-Alvarado, Fernando Martínez-Plumed, Lorenzo Pacchiardi, Hugo Save, Siddhesh Milind Pawar, Behzad Mehrbakhsh, Pablo Antonio Moreno Casares, Ben Slater, Paolo Bova, Peter Romero, Zachary R. Tidler, Jonathan Prunty, Luning Sun, Jose Hernandez-Orallo

TL;DR
This paper introduces a formal framework for measuring AI propensities, the tendencies of models to exhibit particular behaviors, and shows that these measurements improve predictions of AI performance and safety beyond what capability assessments alone provide.
Contribution
It presents the first formal framework for measuring AI propensities, built on a bilogistic formulation of model success, and demonstrates its effectiveness in predicting model behavior across tasks.
Findings
Propensities can be accurately measured using the proposed framework.
Propensity estimates from one benchmark predict behavior on new tasks.
Combining propensities with capabilities improves predictive accuracy.
Abstract
AI evaluation has primarily focused on measuring capabilities, with formal approaches inspired by Item Response Theory (IRT) being increasingly applied. Yet propensities - the tendencies of models to exhibit particular behaviours - play a central role in determining both performance and safety outcomes. However, traditional IRT describes a model's success on a task as a monotonic function of model capabilities and task demands, an approach unsuited to propensities, where both excess and deficiency can be problematic. Here, we introduce the first formal framework for measuring AI propensities by using a bilogistic formulation for model success, which attributes high success probability when the model's propensity is within an "ideal band". Further, we estimate the limits of the ideal band using LLMs equipped with newly developed task-agnostic rubrics. Applying our framework to six…
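To make the contrast between monotonic IRT and an "ideal band" concrete, here is a minimal Python sketch. The abstract does not give the paper's exact equations, so the functional form is an assumption: the bilogistic success probability is taken to be the product of a rising and a falling logistic curve. The names `irt_success` and `bilogistic_success`, and the parameters `lower`, `upper`, and `slope`, are hypothetical and chosen only for illustration.

```python
import math


def logistic(x: float) -> float:
    """Standard logistic (sigmoid) function."""
    return 1.0 / (1.0 + math.exp(-x))


def irt_success(ability: float, difficulty: float, slope: float = 1.0) -> float:
    """Monotonic IRT-style curve: success only increases with ability."""
    return logistic(slope * (ability - difficulty))


def bilogistic_success(propensity: float, lower: float, upper: float,
                       slope: float = 4.0) -> float:
    """Assumed bilogistic form (not taken from the paper): the product of a
    rising logistic at the lower band limit and a falling logistic at the
    upper band limit. It is high only when lower < propensity < upper."""
    rising = logistic(slope * (propensity - lower))
    falling = logistic(slope * (upper - propensity))
    return rising * falling


if __name__ == "__main__":
    # Hypothetical ideal band [-1, 1]: deficiency and excess are both penalised.
    for p in (-3.0, 0.0, 3.0):
        print(f"propensity={p:+.1f}  "
              f"bilogistic={bilogistic_success(p, -1.0, 1.0):.3f}  "
              f"irt={irt_success(p, 0.0):.3f}")
```

Under this assumed form, a propensity inside the band gives a success probability near 1, while values far below or far above the band both drive it toward 0. The monotonic curve, by contrast, keeps rising as its input grows.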
Taxonomy
Topics: Ethics and Social Impacts of AI · Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI)
