TL;DR
StructLens is a novel framework that uses maximum spanning trees to analyze and visualize the internal structural organization of language model representations across layers and training stages.
Contribution
It introduces a holistic structural analysis method for language model representations, revealing how token relationships and organizational units evolve during pre-training.
Findings
Middle layers exhibit the strongest local-span organization.
Smaller local units are detectable earlier in training, while larger units emerge later.
StructLens provides new insights into token organization in language models.
Abstract
Language exhibits inherent structures, a property that explains both language acquisition and language change. Given this characteristic, we expect language models to manifest their own internal structures as well. While interpretability research has investigated how models compute representations mechanistically through attention patterns and Sparse AutoEncoders, the organization of the resulting representations is overlooked. To address this gap, we introduce StructLens, a framework to analyze representations through a holistic structural view. StructLens constructs maximum spanning trees based on the semantic representations in residual streams, inspired by tree representation in dependency parsing, and provides summaries of token relationships in representation space. We analyze how contiguous tokens are also nearby in representation space and find that middle layers show the…
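To make the tree construction described in the abstract concrete, here is a minimal sketch in Python. It is not the authors' implementation. It assumes cosine similarity between token hidden states as the edge weight, and it builds an undirected maximum spanning tree with Prim's algorithm, treating token 0 as the root. The paper's dependency-parsing inspiration and single root node suggest it may use a directed algorithm instead. The function names (`cosine_similarity_matrix`, `maximum_spanning_tree`, `adjacent_edge_fraction`) are hypothetical, and `adjacent_edge_fraction` is only an illustrative stand-in for a local-span measure, not a metric defined in the paper.

```python
import numpy as np


def cosine_similarity_matrix(H):
    """Pairwise cosine similarity between rows of H (tokens x hidden_dim)."""
    norms = np.linalg.norm(H, axis=1, keepdims=True)
    unit = H / np.clip(norms, 1e-12, None)  # guard against zero vectors
    return unit @ unit.T


def maximum_spanning_tree(S):
    """Prim's algorithm on a dense symmetric similarity matrix S.

    Assumption: the tree is undirected and grown from token 0.
    Returns a list of (parent, child) index pairs with n - 1 entries.
    """
    n = S.shape[0]
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True
    best_sim = S[0].astype(float)          # best similarity from the tree to each token
    best_parent = np.zeros(n, dtype=int)   # tree node that achieves best_sim
    edges = []
    for _ in range(n - 1):
        # Pick the token outside the tree that is most similar to the tree.
        candidates = np.where(in_tree, -np.inf, best_sim)
        child = int(np.argmax(candidates))
        edges.append((int(best_parent[child]), child))
        in_tree[child] = True
        # The new node may offer a better attachment point for remaining tokens.
        improved = (S[child] > best_sim) & ~in_tree
        best_sim[improved] = S[child][improved]
        best_parent[improved] = child
    return edges


def adjacent_edge_fraction(edges):
    """Share of tree edges that link tokens at neighbouring positions."""
    if not edges:
        return 0.0
    return sum(abs(p - c) == 1 for p, c in edges) / len(edges)


# Toy "hidden states": four 2-D vectors that rotate gradually with position.
angles = np.deg2rad([0, 10, 20, 30])
H = np.stack([np.cos(angles), np.sin(angles)], axis=1)
edges = maximum_spanning_tree(cosine_similarity_matrix(H))
print(edges, adjacent_edge_fraction(edges))
```

In this toy input each token is most similar to its positional neighbours, so the tree is a chain and every edge is local. With real residual-stream activations, one would run the same steps per layer and compare the resulting fractions across layers.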
Peer Reviews
Decision · Submitted to ICLR 2026
The idea of using a tree to define a structure summary of a layer's representation is interesting. The formulas and construction are clearly stated. The empirical patterns are visually compelling. The demonstrations of the correlation with confidence degradation and the pruning case study provide practical usefulness.
Some simple baselines are not compared, and key design choices are insufficiently justified. Without these pieces, claims about superiority and practical utility are not yet established.
### 1. Baseline to be compared
The paper contrasts StructLens primarily with token-aligned cosine (Eq. 6). However, global inter-layer similarity is standardly assessed with Centered Kernel Alignment (CKA) and close relatives SVCCA/PWCCA. For example, for two layer-representation matrices $X \in \mathbb{R}^{N
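The reviewer's formula is cut off in this excerpt, so it is not reconstructed here. For reference only, the commonly used linear form of CKA between two representation matrices with the same number of rows (one row per token or example) can be sketched as follows. The function name `linear_cka` and the toy data are illustrative assumptions, and this is not necessarily the exact variant the reviewer had in mind.

```python
import numpy as np


def linear_cka(X, Y):
    """Linear CKA between X (n x d1) and Y (n x d2), rows aligned by example.

    Computes ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F) on column-centered inputs.
    """
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(Y.T @ X, ord="fro") ** 2
    norm_x = np.linalg.norm(X.T @ X, ord="fro")
    norm_y = np.linalg.norm(Y.T @ Y, ord="fro")
    return cross / (norm_x * norm_y)


# Toy check: a rotated and rescaled copy of X should score about 1.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))
Q, _ = np.linalg.qr(rng.normal(size=(5, 5)))  # random orthogonal matrix
print(linear_cka(X, 3.0 * X @ Q))
```

Linear CKA is invariant to orthogonal transformations and isotropic scaling of either input, which is why it is often used to compare layers whose coordinate systems are not aligned.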
- StructLens is an original approach to language model interpretability, offering a global structural perspective that complements existing token-level and attention-based analyses.
- The paper provides clear mathematical formulations for tree construction and for the presented similarity metrics.
- The exploration of structure-aware metrics for layer pruning is practical and connects interpretability with model compression.
- Only 50 instances per dataset is a small sample to obtain reliable or generalizable insights.
- The results obtained in Section 4.2 are on a single instance of MMLU, which is too limited to extract conclusions (and occupy an entire page).
- The layer pruning results in Table 5 are inconsistent. In some cases, structure-aware metrics underperform base cosine similarity. There is no statistical significance analysis.
- The findings presented in the paper, e.g., the "island" patterns in Edge-Edit
The paper is clearly structured, and the framework is highly practical. The choice of algorithms and the detailed methodological treatment are commendable, particularly the consideration of a single root node to ensure consistency. The tree-based indicators proposed in the paper are effectively applied and validated in these experiments. Additionally, the case study presented in *Section 4.2: FREQUENT SUBTREES* is a clever choice, effectively illustrating the relationship between language struct
(1) The feasibility of MST calculation and its related algorithms requires further verification. For very large models or long token sequences, the computational cost of this algorithm can be substantial. Moreover, the **Edge-Edit** and **Tree-Edit** indicators involve multiple operations, which may further reduce computational efficiency. (2) The paper conducts experiments only on **Llama 3.1 and Qwen 2.5**, limiting the number of models studied. The choice of models could be improved, esp
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Topic Modeling · Natural Language Processing Techniques
