Uncovering hidden geometry in Transformers via disentangling position and context
Jiajun Song, Yiqiao Zhong

TL;DR
This paper decomposes transformer embeddings into interpretable components, revealing hidden geometric structure tied to position and context and clarifying how these models represent their inputs internally.
Contribution
The paper presents a simple decomposition of transformer embeddings into mean, position, context, and residual components, which uncovers geometric structure and improves interpretability.
Findings
Position vectors form low-dimensional spiral shapes across layers.
Context vectors cluster into meaningful topic groups.
Position and context vectors are nearly orthogonal.
Abstract
Transformers are widely used to extract semantic meanings from input tokens, yet they usually operate as black-box models. In this paper, we present a simple yet informative decomposition of hidden states (or embeddings) of trained transformers into interpretable components. For any layer, embedding vectors of input sequence samples are represented by a tensor \(\boldsymbol{h}\). Given embedding vector \(\boldsymbol{h}_{c,t}\) at sequence position \(t\) in a sequence (or context) \(c\), extracting the mean effects yields the decomposition \[ \boldsymbol{h}_{c,t} = \boldsymbol{\mu} + \mathbf{pos}_t + \mathbf{ctx}_c + \mathbf{resid}_{c,t} \] where \(\boldsymbol{\mu}\) is the global mean vector, \(\mathbf{pos}_t\) and \(\mathbf{ctx}_c\) are the mean vectors across contexts and across positions respectively, and \(\mathbf{resid}_{c,t}\) is…
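The decomposition in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' code: the hidden states of one layer are assumed to be stored as an array `h` of shape `(C, T, d)` (contexts, positions, embedding dimension), the function name `decompose` and the sizes below are hypothetical, and the data is random rather than taken from a trained model. The position and context terms are centered by subtracting the global mean, which is what makes the four terms sum back to the original embedding.

```python
import numpy as np

def decompose(h):
    """Split hidden states h of shape (C, T, d) into global mean,
    position, context, and residual parts via a two-way mean decomposition."""
    mu = h.mean(axis=(0, 1))       # global mean over contexts and positions, shape (d,)
    pos = h.mean(axis=0) - mu      # mean across contexts, centered, shape (T, d)
    ctx = h.mean(axis=1) - mu      # mean across positions, centered, shape (C, d)
    # Whatever the three mean effects do not explain is the residual, shape (C, T, d).
    resid = h - mu - pos[None, :, :] - ctx[:, None, :]
    return mu, pos, ctx, resid

# Synthetic stand-in for one layer's hidden states (sizes are assumptions).
rng = np.random.default_rng(0)
C, T, d = 8, 16, 4
h = rng.normal(size=(C, T, d))

mu, pos, ctx, resid = decompose(h)

# Adding the four components back together recovers every embedding.
recon = mu + pos[None, :, :] + ctx[:, None, :] + resid
print(np.allclose(recon, h))
```

With this centering, the position vectors average to zero over positions, the context vectors average to zero over contexts, and the residual averages to zero along both axes, so each component captures a distinct source of variation.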
Taxonomy
Topics: Interactive and Immersive Displays · Hand Gesture Recognition Systems · Robotics and Sensor-Based Localization
