Embryology of a Language Model

George Wang; Garrett Baker; Andrew Gordon; Daniel Murfet

arXiv:2508.00331·cs.LG·August 4, 2025

Open Access · 4 Reviews

TL;DR

This paper introduces an embryological visualization method using UMAP on susceptibility matrices to reveal the developmental structure of language models, uncovering known and novel features in their internal organization.

Contribution

It presents a new embryological approach for visualizing neural network development, combining susceptibility analysis with UMAP to uncover internal structures.

Findings

01

Revealed the emergence of a 'body plan' in model development

02

Discovered a new 'spacing fin' structure for counting space tokens

03

Demonstrated susceptibility analysis as a tool for uncovering neural mechanisms

Abstract

Understanding how language models develop their internal computational structure is a central problem in the science of deep learning. While susceptibilities, drawn from statistical physics, offer a promising analytical tool, their full potential for visualizing network organization remains untapped. In this work, we introduce an embryological approach, applying UMAP to the susceptibility matrix to visualize the model's structural development over training. Our visualizations reveal the emergence of a clear "body plan," charting the formation of known features like the induction circuit and discovering previously unknown structures, such as a "spacing fin" dedicated to counting space tokens. This work demonstrates that susceptibility analysis can move beyond validation to uncover novel mechanisms, providing a powerful, holistic lens for studying the developmental principles of…

Peer Reviews

Decision·ICLR 2026 Conference Withdrawn Submission

Reviewer 01 · Rating 0 · Confidence 4

Strengths

**Originality: poor** Application of UMAP to high-dimensional susceptibility vectors for visualization is a straightforward combination of existing methodologies rather than a substantial advance. The framing of “embryology” and “body plan” in neural networks reads more as metaphorical novelty than genuine technical originality.

**Quality: poor** The overall quality of the work is limited by a lack of rigorous experimental validation and an overreliance on qualitative visualization. Key claims

Weaknesses

**Unsound/superficial application of UMAP.**
- The paper relies heavily on UMAP for what amount to no more than qualitative visualizations. The limitations of UMAP for interpretability are identified in the appendix, and others are well-documented in various literatures. As a consequence, "anatomical" claims lack quantitative rigor to support the "serpent" as a stable, robust feature rather than a visualization artifact. The authors acknowledge that the geometry of the "serpent" should be "nter

Reviewer 02 · Rating 4 · Confidence 2

Strengths

Overall, I believe the authors have studied an interesting question and created various intriguing visualizations that could be of interest to the community. The paper presentation was good, and the figures are very nicely presented. In particular,
1. The "embryological" analogy, framing model training as a developmental process, is conceptually intuitive yet powerful. Applying UMAP to susceptibility vectors, rather than just model activations, is a novel approach. It provides a global visuali

Weaknesses

The major limitations are already straightforwardly discussed in the paper. Here I rephrase the two most important ones in my opinion:
1. The experiments are pretty severely limited by scale, as they were only visualized with a tiny 3M model with two layers of attention-only modules. It is a pretty significant leap from even tiny-scaled language models by today's standards. This is (in my opinion) the most significant weakness of the present paper, and it is not clear what the limitation is for

Reviewer 03 · Rating 2 · Confidence 3

Strengths

- The approach is unlike most views I've seen in terms of interpreting LLMs and is creative/novel.
- The paper presents a fairly holistic view of interpretability in LLMs. Instead of focusing on single circuits, the method reveals global organization and complementary expression/suppression roles across heads.
- Joint use of UMAP snapshots and per-pattern susceptibility trajectories might point to plausible temporal causal structure emergence.

Weaknesses

- The results presented in the paper are entirely based on a 3M, 2-layer attention-only model. It’s unclear whether the serpent structure and spacing fin persist or change in mid/large LMs with MLPs and modern tokenizers (e.g., non-whitespace-heavy merges).
- Although partly addressed, the method still relies on a nonlinear, stochastic embedding with known global-distance distortions; the work would benefit from corroboration via isometry-aware metrics in the original space.
- The approach may
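One simple check in the spirit of the corroboration this review asks for is to measure how many of each point's nearest neighbours in the original high-dimensional space survive in the 2D embedding. The sketch below is illustrative only and is not taken from the paper: the matrix `chi` is synthetic stand-in data, PCA is swapped in for UMAP so the block needs only NumPy, and the helper names (`knn_indices`, `neighbour_overlap`, `pca_2d`) are hypothetical.

```python
import numpy as np

def knn_indices(points, k):
    """Indices of the k nearest rows (Euclidean) for each row, excluding itself."""
    sq = np.sum(points ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * points @ points.T
    np.fill_diagonal(d2, np.inf)  # a point is never its own neighbour
    return np.argsort(d2, axis=1)[:, :k]

def neighbour_overlap(high, low, k=10):
    """Mean fraction of k-nearest neighbours in `high` that are kept in `low`."""
    nn_high = knn_indices(high, k)
    nn_low = knn_indices(low, k)
    overlaps = [len(set(a) & set(b)) / k for a, b in zip(nn_high, nn_low)]
    return float(np.mean(overlaps))

def pca_2d(x):
    """Project rows onto the top two principal components (stand-in for UMAP)."""
    centred = x - x.mean(axis=0)
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return centred @ vt[:2].T

rng = np.random.default_rng(0)

# Synthetic stand-in for a susceptibility matrix (ASSUMPTION, not real data):
# 200 "tokens" x 16 "components" with a hidden 2D structure plus small noise.
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 16))
chi = latent @ mixing + 0.05 * rng.normal(size=(200, 16))

embedding = pca_2d(chi)
random_layout = rng.normal(size=(200, 2))  # baseline with no relation to chi

score_self = neighbour_overlap(chi, chi, k=10)
score_embedding = neighbour_overlap(chi, embedding, k=10)
score_random = neighbour_overlap(chi, random_layout, k=10)

print("self:", score_self)
print("embedding:", score_embedding)
print("random baseline:", score_random)
```

To apply the same check to a UMAP layout, `embedding` would be replaced by the 2D coordinates produced by UMAP. A score close to the random baseline would suggest that local structure seen in the plot is not present in the original space. scikit-learn's `sklearn.manifold.trustworthiness` offers a related, rank-weighted measure.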

Reviewer 04 · Rating 6 · Confidence 3

Strengths

**Originality.**
* Using **susceptibility vectors** (rather than activations) to visualize response is an interesting perspective that complements circuit-centric methods. The "spacing fin" is a surprising, concrete emergent structure that the authors unraveled with their visualization.
* The biological metaphor (anterior–posterior / dorsal–ventral axes) is consistent throughout the manuscript and helps organize observations about stratification by token pattern.

**Quality / Technical soundness.

Weaknesses

**The probabilistic setup and tractability of the quenched posterior need more transparency.**
* Eq. (2) introduces the posterior $p^{\beta}_{n}(w) \propto \exp\{-n \beta L(w)\} \phi(w)$ with normalizer $Z^{\beta}_{n}$. The *practical* tractability of $Z^{\beta}_{n}$ and how its intractability propagates (or cancels) in $\chi$ estimates are not discussed.

**Motivation for Def. 2.1 could be surfaced earlier.** The definition of susceptibility (Def. 2.1) appears b
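On the normalizer question raised in this review, the following general facts about a posterior of the form quoted from Eq. (2) may help frame the issue. This is a sketch under stated assumptions: $f$ and $g$ are generic observables introduced here for illustration, and the excerpt does not say whether the paper's estimator for $\chi$ takes exactly this form.

```latex
% Normalizer implied by the posterior quoted from Eq. (2)
Z^{\beta}_{n} = \int \exp\{-n \beta L(w)\}\, \phi(w)\, dw

% Posterior expectation of a generic observable f (f is an illustrative symbol)
\mathbb{E}^{\beta}_{n}[f]
  = \frac{1}{Z^{\beta}_{n}} \int f(w)\, \exp\{-n \beta L(w)\}\, \phi(w)\, dw

% A covariance of two observables is built only from such expectations
\operatorname{Cov}^{\beta}_{n}[f, g]
  = \mathbb{E}^{\beta}_{n}[f g] - \mathbb{E}^{\beta}_{n}[f]\, \mathbb{E}^{\beta}_{n}[g]

% Density ratios and log-density gradients do not involve the normalizer
\frac{p^{\beta}_{n}(w')}{p^{\beta}_{n}(w)}
  = \exp\{-n \beta\, (L(w') - L(w))\}\, \frac{\phi(w')}{\phi(w)},
\qquad
\nabla_{w} \log p^{\beta}_{n}(w)
  = -n \beta\, \nabla_{w} L(w) + \nabla_{w} \log \phi(w)
```

Because samplers that rely only on density ratios or on the gradient of the log-density never evaluate $Z^{\beta}_{n}$, any quantity estimated as a sample average over posterior draws avoids computing the normalizer directly. Sampler accuracy and convergence remain separate concerns.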

Taxonomy

Topics: Neural Networks and Reservoir Computing · Machine Learning in Materials Science · Neural Networks and Applications