Mathematics and Coding are Universal AI Benchmarks

Przemyslaw Chojecki

arXiv:2512.13764 · cs.AI · December 17, 2025

Open Access

TL;DR

This paper shows that mathematics and coding serve as universal benchmarks for AI evaluation: coding tasks alone are universal in the density sense, while the privilege of mathematics is spectral rather than expressive, and formal proof kernels give mathematics spectrally stable self-improvement regimes.

Contribution

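It introduces the Mathematics Fiber and proves a density theorem: under uniform tightness of agent outputs and a Lipschitz AAI functional, batteries generated by theorem-proving and coding tasks are dense in the moduli space of AI benchmark batteries with respect to the evaluation metric.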

Findings

01. Coding tasks are dense in the AI benchmark space.

02. Mathematics provides spectral universality, not full expressiveness.

03. Formal proof systems enable stable self-improvement regimes.

Abstract

We study the special role of mathematics and coding inside the moduli space of psychometric batteries for AI agents. Building on the AAI framework and GVU dynamics from previous works, we define the Mathematics Fiber and show that, when paired with formal proof kernels (e.g., Lean, Coq), GVU flows on this fiber admit spectrally stable self-improvement regimes due to oracle-like verification. Our main technical result is a density theorem: under uniform tightness of agent outputs and a Lipschitz AAI functional, the subspace of batteries generated by mathematical theorem-proving and coding tasks is dense in the moduli space of batteries with respect to the evaluation metric. Coding alone is universal in this sense, while pure mathematics is not; its privilege is spectral rather than expressive. We interpret this as evidence that mathematics and coding provide "universal coordinates" for…
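
The density claim in the abstract has the standard topological shape sketched below. The symbols here are placeholder notation assumed for illustration: $\mathcal{M}$ for the moduli space of batteries, $\mathcal{M}_{\mathrm{MC}}$ for the subspace generated by theorem-proving and coding tasks, and $d_{\mathrm{eval}}$ for the evaluation metric. The paper's own notation and exact hypotheses may differ.

```latex
% Notation is assumed for illustration, not taken from the paper.
% Density of the math-and-coding subspace in the space of batteries:
\[
  \forall\, B \in \mathcal{M},\quad \forall\, \varepsilon > 0,\quad
  \exists\, B' \in \mathcal{M}_{\mathrm{MC}} :
  \qquad d_{\mathrm{eval}}(B, B') < \varepsilon .
\]
```

Read this way, any battery can be approximated arbitrarily well in the evaluation metric by one built only from mathematical and coding tasks. The abstract states that this holds under uniform tightness of agent outputs and a Lipschitz AAI functional.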

Taxonomy

Topics: Computability, Logic, AI Algorithms · Embodied and Extended Cognition · Explainable Artificial Intelligence (XAI)