On the geometry and topology of representations: the manifolds of modular addition
Gabriela Moisescu-Pareja, Gavin McCracken, Harley Wiltzer, Vincent Létourneau, Colin Daniels, Doina Precup, Jonathan Love

TL;DR
This paper demonstrates that neural architectures for modular addition, whether they use uniform or learnable attention, produce topologically equivalent representations, revealing a shared underlying geometric structure.
Contribution
It introduces a topological and geometric analysis method to compare neural representations, showing that different architectures implement the same algorithm through equivalent manifolds.
Findings
Uniform and trainable attention architectures are topologically equivalent.
Learned representations form manifolds that can be statistically compared.
Different architectures naturally develop similar modular addition circuits.
Abstract
The Clock and Pizza interpretations, associated with architectures differing in either uniform or learnable attention, were introduced to argue that different architectural designs can yield distinct circuits for modular addition. In this work, we show that this is not the case, and that both uniform attention and trainable attention architectures implement the same algorithm via topologically and geometrically equivalent representations. Our methodology goes beyond the interpretation of individual neurons and weights. Instead, we identify all of the neurons corresponding to each learned representation and then study the collective group of neurons as one entity. This method reveals that each learned representation is a manifold that we can study utilizing tools from topology. Based on this insight, we can statistically analyze the learned representations across hundreds of circuits to…
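For readers unfamiliar with the setting, the following is a minimal sketch of the textbook trigonometric construction for modular addition that the Clock interpretation refers to. It is not the paper's trained networks or its analysis code. The function names `embed` and `clock_add`, the modulus `p = 13`, and the frequency set `[1, 2, 5]` are illustrative assumptions. Each input is mapped to points on circles, so a pair of inputs lies on a product of circles (a torus), and the answer is read off by comparing the summed angle against every candidate output.

```python
import numpy as np

def embed(x, p, freqs):
    """Map integers x to (cos, sin) of the angles 2*pi*k*x/p, one column per frequency k."""
    angles = 2 * np.pi * np.outer(np.atleast_1d(x), freqs) / p
    return np.cos(angles), np.sin(angles)

def clock_add(a, b, p, freqs):
    """Compute (a + b) mod p by adding angles on circles (hypothetical helper)."""
    cos_a, sin_a = embed(a, p, freqs)
    cos_b, sin_b = embed(b, p, freqs)
    # Angle-addition identities give cos and sin of (theta_a + theta_b).
    cos_sum = cos_a * cos_b - sin_a * sin_b
    sin_sum = sin_a * cos_b + cos_a * sin_b
    # Score each candidate c by sum over k of cos(theta_a + theta_b - theta_c).
    cos_c, sin_c = embed(np.arange(p), p, freqs)
    logits = (cos_sum * cos_c + sin_sum * sin_c).sum(axis=1)
    return int(np.argmax(logits))

p, freqs = 13, [1, 2, 5]  # assumed values, chosen only for illustration
print(clock_add(7, 9, p, freqs), (7 + 9) % p)
```

Because `p` is prime and every frequency is nonzero modulo `p`, each cosine term reaches its maximum of 1 only at the correct residue, so the argmax recovers `(a + b) % p`.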
Peer Reviews
Decision·ICLR 2026 Poster
- The question of universal representations is important and modular addition is a good setup to study this question.
- The topological/geometric approach is interesting and sheds light on the representations learned by neural networks.
- The paper studies a diverse and comprehensive set of four architectures: MLP-add, MLP-concat, Attention 0.0 (transformers with constant attention), Attention 1.0 (transformers with learnable attention).
- The paper combines theoretical analysis with a comprehens
1. The paper uses ambiguous language that might lead the reader to confuse neural network architectures and the mechanisms learned by these networks. For example, in lines 116-118, the terms Clock and Pizza are introduced as two different architectures, when in fact they are different mechanisms learnable by the same architecture (the standard transformer block: MLP + Softmax Attention). In lines 151-154, the authors are making a statement seemingly about architectures, when in reality they are re
Theoretical clarity: Derivation of closed-form preactivation manifolds (Theorem 4.1) provides a clear and elegant link between Fourier structure and representation geometry.
Unified interpretation: The paper offers a coherent view showing that previously “distinct” circuits (Clock vs Pizza) are mathematically equivalent, reinforcing the universality hypothesis.
Methodological novelty: Introduces quantitative topology tools (Betti distributions, PAD + MMD metrics) for large-scale circuit compar
Missing connection to real LLMs: https://arxiv.org/abs/2406.03445 shows that similar Fourier-modular features emerge in large pretrained transformers, yet this paper does not discuss whether such real representations share the same torus or disc topology. Integrating this perspective could help demonstrate broader relevance beyond synthetic tasks.
Insufficient engagement with prior findings: https://arxiv.org/pdf/2311.07568 and https://arxiv.org/pdf/2402.09469 already show that linear superposi
The paper's key strength is the depth of the empirical analysis. The authors consider a number of techniques to probe the representations learned by the networks including PCA, activation strengths, PAD and Betti numbers. This is all done for a number of different architectures. The theoretical analysis also seems sound and is convincing. Overall, the paper would likely be of significance to those in mechanistic interpretability, particularly those studying modular addition.
In my view, the paper doesn't have many weaknesses. My main critique is on clarity: sections 4.1 and 5 could benefit from more intuition provided for the theorem and proposed analyses. For instance, it's not immediately obvious what the factorization structure of X in Theorem 4.1 has to do with geometry. Similarly, in the description of PAD, the significance of a strong diagonal is never explicitly explained. I would encourage the authors to use their extra page in the revision to more slowly wa
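The reviews above mention MMD as one of the metrics used to compare representations statistically. As a rough illustration of what such a comparison computes, here is a minimal sketch of the standard biased estimator of squared maximum mean discrepancy with a Gaussian (RBF) kernel. The kernel choice, the bandwidth, and the names `rbf_kernel` and `mmd2_biased` are assumptions made for this example; the excerpts here do not say which kernel or estimator the paper uses.

```python
import numpy as np

def rbf_kernel(x, y, bandwidth):
    """Gaussian kernel matrix between the rows of x and the rows of y."""
    sq_dists = (
        np.sum(x ** 2, axis=1)[:, None]
        + np.sum(y ** 2, axis=1)[None, :]
        - 2 * x @ y.T
    )
    # Clip tiny negative values caused by floating-point cancellation.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2 * bandwidth ** 2))

def mmd2_biased(x, y, bandwidth=1.0):
    """Biased estimate of squared MMD between two samples (hypothetical helper)."""
    k_xx = rbf_kernel(x, x, bandwidth).mean()
    k_yy = rbf_kernel(y, y, bandwidth).mean()
    k_xy = rbf_kernel(x, y, bandwidth).mean()
    return k_xx + k_yy - 2 * k_xy

rng = np.random.default_rng(0)
same_a = rng.normal(size=(200, 2))          # two samples from one distribution
same_b = rng.normal(size=(200, 2))
shifted = rng.normal(loc=3.0, size=(200, 2))  # sample from a shifted distribution
print(mmd2_biased(same_a, same_b), mmd2_biased(same_a, shifted))
```

A value near zero suggests the two samples are hard to tell apart under the chosen kernel, while a clearly larger value indicates a distributional difference.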
Taxonomy
Topics: Topological and Geometric Data Analysis · Neural Networks and Reservoir Computing · Ferroelectric and Negative Capacitance Devices
