Modularity in Transformers: Investigating Neuron Separability & Specialization
Nicholas Pochinkov, Thomas Jones, Mohammed Rashidur Rahman

TL;DR
This paper explores the internal neuron structure of transformer models, revealing task-specific clusters and inherent modularity that can inform interpretability and efficiency improvements.
Contribution
It introduces a novel analysis combining pruning and MoEfication clustering to uncover neuron specialization and overlap across tasks in transformer models.
Findings
Neuron clusters are task-specific with some overlap.
Neuron importance patterns persist to some extent even in randomly initialized models.
MoEfication clusters align with task-specific neurons in different layers.
Abstract
Transformer models are increasingly prevalent in various applications, yet our understanding of their internal workings remains limited. This paper investigates the modularity and task specialization of neurons within transformer architectures, focusing on both vision (ViT) and language (Mistral 7B) models. Using a combination of selective pruning and MoEfication clustering techniques, we analyze the overlap and specialization of neurons across different tasks and data subsets. Our findings reveal evidence of task-specific neuron clusters, with varying degrees of overlap between related tasks. We observe that neuron importance patterns persist to some extent even in randomly initialized models, suggesting an inherent structure that training refines. Additionally, we find that neuron clusters identified through MoEfication correspond more strongly to task-specific neurons in earlier and…
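The overlap analysis described in the abstract can be illustrated with a small sketch. The paper's actual importance scores and overlap measure are not given in this summary, so everything here is an assumption: the toy score lists, the helper names `top_neurons` and `jaccard_overlap`, the top-fraction selection rule, and the use of Jaccard similarity as the overlap metric.

```python
def top_neurons(scores, fraction):
    """Return the indices of the highest-scoring neurons (hypothetical selection rule)."""
    k = max(1, int(len(scores) * fraction))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:k])


def jaccard_overlap(a, b):
    """Jaccard similarity of two neuron index sets (an assumed overlap measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Toy per-neuron importance scores for two tasks; illustrative values only,
# not data from the paper.
code_scores = [0.9, 0.1, 0.8, 0.2, 0.7, 0.05, 0.3, 0.6]
text_scores = [0.85, 0.75, 0.1, 0.2, 0.65, 0.9, 0.3, 0.05]

code_top = top_neurons(code_scores, fraction=0.5)
text_top = top_neurons(text_scores, fraction=0.5)
overlap = jaccard_overlap(code_top, text_top)

print(sorted(code_top), sorted(text_top), round(overlap, 3))
```

Under these assumptions, a value near 1 would indicate that two tasks rely on largely the same neurons, while a value near 0 would indicate separable, specialized sets.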
Taxonomy
Topics: Neural Networks and Applications
Methods: Pruning
