Testing Transformer Learnability on the Arithmetic Sequence of Rooted Trees

Alessandro Breccia; Federica Gerace; Marco Lippi; Gabriele Sicuro; Pierluigi Contucci

arXiv:2512.01870 · cs.AI · December 2, 2025

TL;DR

This paper investigates whether a transformer model can learn the deterministic sequence of rooted trees generated by iterated prime factorization of the natural numbers, and finds that the model partially learns the underlying arithmetic grammar.

Contribution

The paper shows that a GPT-2 model trained from scratch on a large prefix of the rooted-tree sequence can partially learn its internal arithmetic structure and regularities.

Findings

01. Model captures non-trivial regularities in the sequence

02. Transformer exhibits partial understanding of arithmetic grammar

03. Learnability extends beyond empirical data to arithmetic structures

Abstract

We study whether a Large Language Model can learn the deterministic sequence of trees generated by the iterated prime factorization of the natural numbers. Each integer is mapped into a rooted planar tree and the resulting sequence $N T$ defines an arithmetic text with measurable statistical structure. A transformer network (the GPT-2 architecture) is trained from scratch on the first $10^{11}$ elements to subsequently test its predictive ability under next-word and masked-word prediction tasks. Our results show that the model partially learns the internal grammar of $N T$, capturing non-trivial regularities and correlations. This suggests that learnability may extend beyond empirical data to the very structure of arithmetic.
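To make the integer-to-tree idea concrete, here is a minimal Python sketch of one possible reading of "iterated prime factorization". The abstract does not spell out the construction, so everything below is an assumption rather than the authors' encoding: each distinct prime factor contributes one child (ordered by increasing prime), the child's subtree is the tree of that prime's exponent, and the tree is written as a balanced-parenthesis string. The names `factorize`, `tree`, and `to_parens` are hypothetical.

```python
def factorize(n):
    """Return the prime factorization of n as a list of (prime, exponent),
    sorted by increasing prime. Plain trial division, fine for small n."""
    factors = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            e = 0
            while n % p == 0:
                n //= p
                e += 1
            factors.append((p, e))
        p += 1
    if n > 1:
        factors.append((n, 1))  # remaining n is a prime with exponent 1
    return factors


def tree(n):
    """ASSUMED encoding (not confirmed by the source): the root gets one
    child per distinct prime factor, left to right in increasing prime
    order, and each child is the tree of that prime's exponent."""
    return tuple(tree(e) for _, e in factorize(n))


def to_parens(t):
    """Serialize a nested-tuple tree as a balanced-parenthesis word."""
    return "(" + "".join(to_parens(child) for child in t) + ")"


# Serialize the first few integers as a small "arithmetic text".
for n in range(1, 13):
    print(n, to_parens(tree(n)))
```

Under this assumed encoding, 12 = 2^2 · 3 and 18 = 2 · 3^2 yield different words because child order is kept (the tree is planar), while all primes collapse to the same word. The paper's actual map may retain more information than this sketch does.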

Taxonomy

Topics: Machine Learning and Algorithms · Topic Modeling · Computability, Logic, AI Algorithms