On the Measure of a Model: From Intelligence to Generality

Ruchira Dhar; Ninell Oldenburg; Anders Soegaard

arXiv:2511.11773 · cs.AI · November 18, 2025

TL;DR

This paper argues that evaluating AI models should focus on their generality, a measurable and stable trait, rather than abstract notions of intelligence, to better reflect their real-world utility across diverse tasks.

Contribution

The paper provides a conceptual and formal analysis showing that generality, not intelligence, is the key stable metric for evaluating AI models across tasks.

Findings

01. Generality is more stable and empirically supported than intelligence.

02. Evaluation should be based on measurable performance breadth.

03. Generality aligns with multitask learning principles.
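
As a rough illustration of what "measurable performance breadth" could look like in practice, the sketch below summarizes one model's scores across several tasks. It is not the paper's formal definition of generality: the function name `breadth_summary`, the `threshold` parameter, the three summary statistics, and the example scores are all assumptions made here for illustration. Only the task names are taken from the abstract.

```python
from statistics import mean


def breadth_summary(scores, threshold=0.5):
    """Summarize per-task scores (each assumed to lie in [0, 1]) for one model.

    Hypothetical helper: returns the mean score, the worst-task score,
    and the fraction of tasks scoring at or above `threshold`.
    """
    if not scores:
        raise ValueError("scores must contain at least one task")
    values = list(scores.values())
    return {
        "mean_score": mean(values),
        "worst_task_score": min(values),
        "coverage": sum(v >= threshold for v in values) / len(values),
    }


# Made-up scores for illustration only; these are not results from the paper.
scores = {"question_answering": 0.80, "summarization": 0.60, "coding": 0.40}
summary = breadth_summary(scores)
print(summary)
```

Reporting the worst-task score and coverage alongside the mean is one way to keep a single strong task from masking weak performance elsewhere; other aggregation choices are equally plausible.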

Abstract

Benchmarks such as ARC, Raven-inspired tests, and the Blackbird Task are widely used to evaluate the intelligence of large language models (LLMs). Yet the concept of intelligence remains elusive, lacking a stable definition and failing to predict performance on practical tasks such as question answering, summarization, or coding. Optimizing for such benchmarks risks misaligning evaluation with real-world utility. Our perspective is that evaluation should be grounded in generality rather than abstract notions of intelligence. We identify three assumptions that often underpin intelligence-focused evaluation: generality, stability, and realism. Through conceptual and formal analysis, we show that only generality withstands conceptual and empirical scrutiny. Intelligence is not what enables generality; generality is best understood as a multitask learning problem that directly links…

Taxonomy

Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Multimodal Machine Learning Applications