Probing Neural Topology of Large Language Models

Yu Zheng; Yuan Yuan; Yue Zhuo; Yong Li; Gabriel Kreiman; Tomaso Poggio; Paolo Santi

arXiv:2506.01042·cs.CL·January 30, 2026

TL;DR

This paper introduces graph probing, a method to analyze the neural topology of large language models, revealing that their performance can be predicted and improved by understanding their neural connectivity structures.

Contribution

The work presents a novel graph probing technique that uncovers neural connectivity in LLMs, demonstrating its superiority over activation-based probing and its potential for model optimization.

Findings

01. Neural topology predicts language performance with high accuracy.

02. Topology-based probing outperforms activation-based probing by a wide margin.

03. Graph probing identifies key network structures such as default networks and hub neurons.

Abstract

Probing large language models (LLMs) has yielded valuable insights into their internal mechanisms by linking neural activations to interpretable semantics. However, the complex mechanisms that link neurons' functional co-activation with emergent model capabilities remain largely unknown, hindering a deeper understanding and safer development of LLMs. In this work, we introduce graph probing, a method for uncovering the functional connectivity of LLM neurons and relating it to language generation performance. By probing models across diverse LLM families and scales, we discover a universal predictability of language generation and understanding performance using only neural topology, which persists even when retaining just 1% of neuron connections. Strikingly, probing on topology outperforms probing on activation by up to 130.4% and 67.7% on perplexity and space/time semantic…
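As an illustration of the idea of functional connectivity mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes that connectivity is measured as the Pearson correlation between neuron activations over the tokens of one sequence, and that "retaining 1% of connections" means keeping the edges with the largest absolute correlation. The function names `functional_connectivity` and `sparsify` are hypothetical, and the random array stands in for real hidden states.

```python
import numpy as np


def functional_connectivity(hidden_states: np.ndarray) -> np.ndarray:
    """Correlate neurons over time.

    hidden_states has shape (num_tokens, num_neurons); the result is a
    symmetric (num_neurons, num_neurons) matrix with a zero diagonal.
    Pearson correlation is an assumption of this sketch.
    """
    # np.corrcoef treats rows as variables, so neurons must be on the rows.
    corr = np.corrcoef(hidden_states.T)
    # A neuron with constant activation yields NaN; treat it as unconnected.
    corr = np.nan_to_num(corr)
    np.fill_diagonal(corr, 0.0)
    return corr


def sparsify(corr: np.ndarray, keep_fraction: float = 0.01) -> np.ndarray:
    """Keep only the strongest undirected edges by absolute correlation."""
    n = corr.shape[0]
    rows, cols = np.triu_indices(n, k=1)
    weights = np.abs(corr[rows, cols])
    k = max(1, int(round(keep_fraction * weights.size)))
    # Indices of the k largest weights (order within the top k is arbitrary).
    top = np.argpartition(weights, -k)[-k:]
    sparse = np.zeros_like(corr)
    sparse[rows[top], cols[top]] = corr[rows[top], cols[top]]
    sparse[cols[top], rows[top]] = corr[rows[top], cols[top]]
    return sparse


# Placeholder data: 128 tokens, 64 neurons (random, not real LLM activations).
rng = np.random.default_rng(0)
states = rng.standard_normal((128, 64))
graph = functional_connectivity(states)
# 64 neurons give 2016 possible undirected edges; 1% keeps 20 of them.
sparse_graph = sparsify(graph, keep_fraction=0.01)
```

In this sketch the dense graph costs quadratic memory in the number of neurons, which is why the sparsified version is the more practical input to a downstream probe.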

Peer Reviews

Decision·Submitted to ICLR 2026

Reviewer 01·Rating 4·Confidence 5

Strengths

1. The integration of functional connectivity concepts from neuroscience into LLM analysis is conceptually innovative and may contribute to mechanistic interpretability.
2. The authors have demonstrated that functional connectivity outperforms activation in predicting model performance, i.e., perplexity.
3. The authors have demonstrated the promising potential of “graph probing” for practical applications, such as model pruning, hallucination detection, and model fingerprinting.

Weaknesses

1. The manuscript is overloaded, while the validation of each point seems insufficient.
2. Some interpretations of the results might be overstated.
3. Experimental results on large-scale models are missing.

Reviewer 02·Rating 8·Confidence 3

Strengths

I find the paper well written and the empirical results are solid and sound. In terms of the main result, it shows neural topology probes reliably predict next-token performance and outperform activation-only baselines across multiple LLM families/sizes, with careful 8:2 train/test splits and standard regression metrics. The experimental setups are valid. The paper also shows that the results are robust. In particular, performance persists under heavy graph sparsification and across model sizes
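The evaluation protocol this review refers to (an 8:2 train/test split scored with standard regression metrics) can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's probe: a ridge-regularised linear regressor stands in for whatever probe architecture the authors use, the graphs and targets are synthetic, and the names `graph_features` and `fit_linear_probe` are hypothetical.

```python
import numpy as np


def graph_features(adjacency: np.ndarray) -> np.ndarray:
    """Flatten the upper triangle of a symmetric graph into a feature vector."""
    rows, cols = np.triu_indices(adjacency.shape[0], k=1)
    return adjacency[rows, cols]


def fit_linear_probe(X, y, ridge=1e-3, train_fraction=0.8, seed=0):
    """Fit ridge regression on a random 8:2 split; report MSE and R^2 on the test part."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    split = int(train_fraction * len(X))
    train, test = order[:split], order[split:]

    # Append a constant column so the probe can learn a bias term.
    Xb = np.hstack([X, np.ones((len(X), 1))])
    gram = Xb[train].T @ Xb[train] + ridge * np.eye(Xb.shape[1])
    weights = np.linalg.solve(gram, Xb[train].T @ y[train])

    pred = Xb[test] @ weights
    residual = np.sum((y[test] - pred) ** 2)
    total = np.sum((y[test] - y[test].mean()) ** 2)
    metrics = {
        "mse": float(np.mean((y[test] - pred) ** 2)),
        "r2": float(1.0 - residual / total),
        "n_train": len(train),
        "n_test": len(test),
    }
    return weights, metrics


# Synthetic stand-in data: 200 symmetric 8-node graphs with a linear target.
rng = np.random.default_rng(1)
raw = rng.standard_normal((200, 8, 8))
graphs = (raw + raw.transpose(0, 2, 1)) / 2.0
X = np.stack([graph_features(g) for g in graphs])
true_w = rng.standard_normal(X.shape[1])
y = X @ true_w + 0.05 * rng.standard_normal(len(X))  # hypothetical per-sequence score

_, metrics = fit_linear_probe(X, y)
print(metrics)
```

Because the synthetic target is linear in the edge weights, the held-out R^2 here is close to 1; real topology-to-perplexity relationships need not be linear, so the numbers say nothing about the paper's results.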

Weaknesses

While the paper identifies a default network, I find it insufficient in explaining exactly what the network does. In addition, I'd love to see a mechanistic interpretation of “why” topology predicts performance. If I understand correctly, building the per-sequence correlation graphs requires collecting full hidden-state time series and computing pairwise correlations, then training a probe. Although sparsification helps at inference time, the initial topology extraction can be costly in $n$ and

Reviewer 03·Rating 2·Confidence 4

Strengths

This paper is well-written and polished: the text is clear, and the graphs are well-constructed. I also think that this particular style of probing is novel: I haven't heard of others using it before.

Weaknesses

**Unclear Purpose of Proposed Method**: The primary use of probing in interpretability is, as noted by this paper, "linking neural activations to interpretable semantics". But this paper never actually does this, which is the one thing that one would expect from a probing paper. In fact, since it's restricted to sequence-level probing, it can't do many probing tasks, which are often token-level. Instead, it probes for things like perplexity (why would you want a probe to predict that?) and use

Code & Models

Repositories

DavyMorgan/llm-graph-probing
PyTorch·Official

Taxonomy

Topics: Topic Modeling