Visualizing LLM Latent Space Geometry Through Dimensionality Reduction
Alex Ning, Vainateya Rangaraju, Yen-Ling Kuo

TL;DR
This paper uses dimensionality reduction (PCA and UMAP) to visualize and analyze the geometric structure of latent spaces in Transformer-based language models such as GPT-2 and LLaMa, revealing patterns such as a separation between attention and MLP component outputs.
Contribution
It introduces a systematic approach to visualize and interpret latent space geometries in LLMs, uncovering novel patterns such as separation of attention and MLP outputs.
Findings
Clear separation between attention and MLP components in latent space
High norm of initial sequence position states
Helical structure of positional embeddings
Abstract
Large language models (LLMs) achieve state-of-the-art results across many natural language tasks, but their internal mechanisms remain difficult to interpret. In this work, we extract, process, and visualize latent state geometries in Transformer-based language models through dimensionality reduction. We capture layerwise activations at multiple points within Transformer blocks and enable systematic analysis through Principal Component Analysis (PCA) and Uniform Manifold Approximation and Projection (UMAP). We demonstrate experiments on GPT-2 and LLaMa models, where we uncover interesting geometric patterns in latent space. Notably, we identify a clear separation between attention and MLP component outputs across intermediate layers, a pattern not documented in prior work to our knowledge. We also characterize the high norm of latent states at the initial sequence position and visualize…
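The PCA step described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the activations below are synthetic stand-ins generated with NumPy rather than states captured from GPT-2 or LLaMa, and the names `pca_project`, `attn_out`, and `mlp_out` are hypothetical. The sketch only shows how a set of latent states with shape (n_samples, d_model) might be centered and projected onto its top principal components for plotting.

```python
import numpy as np


def pca_project(states, n_components=2):
    """Project rows of `states` (n_samples, d_model) onto the top principal components.

    Returns the projected coordinates and the fraction of variance
    explained by each retained component.
    """
    # PCA requires mean-centered data.
    centered = states - states.mean(axis=0, keepdims=True)
    # Rows of `vt` are the principal directions, ordered by singular value.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    projected = centered @ components.T
    explained = singular_values**2 / np.sum(singular_values**2)
    return projected, explained[:n_components]


# Synthetic stand-ins (assumption): two groups of vectors offset along one axis,
# loosely mimicking two component outputs that occupy different regions.
rng = np.random.default_rng(0)
d_model = 64
offset = 3.0 * np.eye(d_model)[0]
attn_out = rng.normal(size=(200, d_model)) + offset
mlp_out = rng.normal(size=(200, d_model)) - offset

states = np.concatenate([attn_out, mlp_out])
projected, explained = pca_project(states, n_components=2)

# Mean position of each group along the first principal component.
attn_mean_pc1 = projected[:200, 0].mean()
mlp_mean_pc1 = projected[200:, 0].mean()
```

In this toy setup the two groups land on opposite sides of the first principal component, which is the kind of separation a 2D scatter plot of `projected` would make visible. With real models, `states` would instead hold activations captured at chosen points inside each Transformer block.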
Taxonomy
Topics: Topic Modeling · Artificial Intelligence in Healthcare and Education · Natural Language Processing Techniques
