
TL;DR
This paper applies psychometric analysis to AI models, revealing a strong positive manifold of general intelligence that evolves over time and is influenced by specialized reasoning tools.
Contribution
It connects psychometric G-factor analysis to AI benchmark performance, showing how general intelligence in models changes with new architectures and tools.
Findings
Strong positive correlations across AI benchmarks indicate a G-factor.
The G-factor explains up to 92% of variance in model performance.
Specialized reasoning models cause a shift in the G-factor structure.
Abstract
In the psychological literature, the term 'general intelligence' describes correlations between abilities, not simply the number of abilities. This paper connects Spearman's g-factor from psychometrics, which measures a positive manifold, to the implicit "G-factor" in claims about artificial general intelligence (AGI) performance on temporally structured benchmarks. By treating LLM benchmark batteries as cognitive test batteries and model releases as subjects, principal component analysis is applied to a models × benchmarks × time matrix spanning 39 models (2019–2025) and 14 benchmarks. Preliminary results confirm a strong positive manifold in which all 28 pairwise correlations are positive across 8 benchmarks. By analyzing the spectrum of the benchmark correlation matrix through time, PC1 explains 90% of variance on a 5-benchmark core battery, reducing to 77% by 2024.…
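The two quantities the abstract relies on, the positive manifold and the share of variance explained by PC1, can be illustrated with a minimal sketch. This is not the paper's code: the function names `pc1_variance_share` and `has_positive_manifold` are hypothetical, the scores are synthetic (one latent ability plus noise), and only the matrix shape of 39 models by 8 benchmarks is borrowed from the abstract. It assumes PCA is run on the benchmark correlation matrix, so PC1's share is the largest eigenvalue divided by the sum of all eigenvalues.

```python
import numpy as np


def pc1_variance_share(scores):
    """Share of variance explained by the first principal component.

    scores: array of shape (n_models, n_benchmarks); rows play the role
    of subjects, columns the role of tests in a cognitive battery.
    """
    corr = np.corrcoef(scores, rowvar=False)  # benchmark x benchmark
    eigvals = np.linalg.eigvalsh(corr)        # ascending order
    return float(eigvals[-1] / eigvals.sum())


def has_positive_manifold(scores):
    """True if every pairwise benchmark correlation is positive."""
    corr = np.corrcoef(scores, rowvar=False)
    upper = np.triu_indices_from(corr, k=1)   # each pair counted once
    return bool(np.all(corr[upper] > 0))


# Synthetic data only (an assumption for illustration, not the paper's
# results): a single latent "ability" drives all benchmarks, plus noise.
rng = np.random.default_rng(0)
n_models, n_benchmarks = 39, 8
ability = rng.normal(size=(n_models, 1))
loadings = rng.uniform(0.7, 0.95, size=(1, n_benchmarks))
noise = rng.normal(scale=0.4, size=(n_models, n_benchmarks))
scores = ability @ loadings + noise

n_pairs = n_benchmarks * (n_benchmarks - 1) // 2  # 28 pairs for 8 benchmarks
share = pc1_variance_share(scores)
manifold = has_positive_manifold(scores)
print(f"pairs={n_pairs}, positive manifold={manifold}, PC1 share={share:.2f}")
```

Repeating the same computation on the subset of models released up to each year would give one way to track PC1's share through time, which is the kind of temporal comparison the abstract describes.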
