Embryology of a Language Model
George Wang, Garrett Baker, Andrew Gordon, Daniel Murfet

TL;DR
This paper introduces an embryological visualization method using UMAP on susceptibility matrices to reveal the developmental structure of language models, uncovering known and novel features in their internal organization.
Contribution
It presents a new embryological approach for visualizing neural network development, combining susceptibility analysis with UMAP to uncover internal structures.
Findings
Revealed the emergence of a 'body plan' in model development
Discovered a new 'spacing fin' structure for counting space tokens
Demonstrated susceptibility analysis as a tool for uncovering neural mechanisms
Abstract
Understanding how language models develop their internal computational structure is a central problem in the science of deep learning. While susceptibilities, drawn from statistical physics, offer a promising analytical tool, their full potential for visualizing network organization remains untapped. In this work, we introduce an embryological approach, applying UMAP to the susceptibility matrix to visualize the model's structural development over training. Our visualizations reveal the emergence of a clear "body plan," charting the formation of known features like the induction circuit and discovering previously unknown structures, such as a "spacing fin" dedicated to counting space tokens. This work demonstrates that susceptibility analysis can move beyond validation to uncover novel mechanisms, providing a powerful, holistic lens for studying the developmental principles of…
Peer Reviews
Decision · ICLR 2026 Conference Withdrawn Submission
**Originality: poor** Application of UMAP to high-dimensional susceptibility vectors for visualization is a straightforward combination of existing methodologies rather than a substantial advance. The framing of “embryology” and “body plan” in neural networks reads more as metaphorical novelty than genuine technical originality. **Quality: poor** The overall quality of the work is limited by a lack of rigorous experimental validation and an overreliance on qualitative visualization. Key claims
**Unsound/superficial application of UMAP.** - The paper relies heavily on UMAP for what amount to no more than qualitative visualizations. The limitations of UMAP for interpretability are identified in the appendix, and others are well-documented in various literatures. As a consequence, "anatomical" claims lack quantitative rigor to support the "serpent" as a stable, robust feature rather than a visualization artifact. The authors acknowledge that the geometry of the "serpent" should be "nter
Overall, I believe the authors have studied an interesting question and created various intriguing visualizations that could be of interest to the community. The paper presentation was good, and the figures are very nicely presented. In particular, 1. The "embryological" analogy, framing model training as a developmental process, is conceptually intuitive yet powerful. Applying UMAP to susceptibility vectors, rather than just model activations, is a novel approach. It provides a global visuali
The major limitations are already straightforwardly discussed in the paper. Here I rephrase the two most important ones in my opinion: 1. The experiments are pretty severely limited by scale, as they were only visualized with a tiny 3M model with two layers of attention-only modules. It is a pretty significant leap from even tiny-scaled language models by today's standards. This is (in my opinion) the most significant weakness of the present paper, and it is not clear what the limitation is for
- The approach is unlike most views I've seen for interpreting LLMs, and it is creative and novel.
- The paper presents a fairly holistic view of interpretability in LLMs. Instead of focusing on single circuits, the method reveals global organization and complementary expression/suppression roles across heads.
- Joint use of UMAP snapshots and per-pattern susceptibility trajectories might point to the emergence of plausible temporal causal structure.
- The results presented in the paper are entirely based on a 3M, 2-layer attention-only model. It’s unclear whether the serpent structure and spacing fin persist or change in mid/large LMs with MLPs and modern tokenizers (e.g., non-whitespace-heavy merges).
- Although partly addressed, the method still relies on a nonlinear, stochastic embedding with known global-distance distortions; the work would benefit from corroboration via isometry-aware metrics in the original space.
- The approach may
**Originality.**
* Using **susceptibility vectors** (rather than activations) to visualize response is an interesting perspective that complements circuit-centric methods. The "spacing fin" is a surprising, concrete emergent structure that the authors uncovered with their visualization.
* The biological metaphor (anterior–posterior / dorsal–ventral axes) is consistent throughout the manuscript and helps organize observations about stratification by token pattern.

**Quality / Technical soundness.
**The probabilistic setup and tractability of the quenched posterior need more transparency.**
* Eq. (2) introduces the posterior $p^{\beta}_{n}(w) \propto \exp\{-n \beta L(w)\} \phi(w)$ with normalizer $Z^{\beta}_{n}$. The *practical* tractability of $Z^{\beta}_{n}$ and how its intractability propagates (or cancels) in $\chi$ estimates are not discussed.

**Motivation for Def. 2.1 could be surfaced earlier.** The definition of susceptibility (Def. 2.1) appears b
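
For reference, the posterior and normalizer quoted in this review can be written out in full. This is only a sketch: the integration domain $W$ (the parameter space) is an assumption not stated in the excerpt, and $\phi$ is read here as the prior density over $w$.

```latex
p^{\beta}_{n}(w) = \frac{1}{Z^{\beta}_{n}} \exp\{-n \beta L(w)\}\, \phi(w),
\qquad
Z^{\beta}_{n} = \int_{W} \exp\{-n \beta L(w)\}\, \phi(w)\, dw
```

Written this way, any expectation under $p^{\beta}_{n}$ is a ratio of two integrals that share the factor $Z^{\beta}_{n}$, so sampling-based estimators generally do not need to evaluate it. Whether the paper's $\chi$ estimates rely on this cancellation is the point the reviewer asks the authors to state explicitly.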
Taxonomy
Topics: Neural Networks and Reservoir Computing · Machine Learning in Materials Science · Neural Networks and Applications
