Neurons in Large Language Models: Dead, N-gram, Positional
Elena Voita, Javier Ferrando, Christoforos Nalmpantis

TL;DR
This study analyzes the OPT family of large language models and finds that many FFN neurons are dead or reserved for discrete features, that some neurons explicitly remove information from the residual stream, and that positional neurons vary with model size.
Contribution
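It provides an analysis method light enough to run on a single GPU, relying only on whether an FFN neuron is activated. Applied to OPT models from 125m to 66b parameters, it uncovers neuron sparsity, feature detection, and information removal mechanisms across scales.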
Findings
More than 70% of neurons in some layers of the 66b model are "dead": they never activate on a large collection of diverse data.
Many neurons act as token and n-gram detectors.
Some neurons explicitly remove information from the residual stream.
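One way to picture the promote-and-remove behaviour listed above is to project a neuron's FFN output vector onto the vocabulary and read off which tokens it pushes up or down. The sketch below is illustrative only: the names `unembed`, `neuron_out`, and `vocab_effect` are hypothetical, the toy unembedding is an identity matrix, and the final layer norm is ignored. It is not the authors' procedure.

```python
import numpy as np

def vocab_effect(unembed: np.ndarray, neuron_out: np.ndarray) -> np.ndarray:
    """Logit change per unit of activation for every vocabulary token.

    unembed:    (vocab_size, d_model) output embedding matrix (assumed layout).
    neuron_out: (d_model,) vector the neuron writes to the residual stream.
    """
    return unembed @ neuron_out

# Toy setup (assumption): 5 tokens, one orthogonal direction per token.
unembed = np.eye(5)
trigger_token, promoted_token = 1, 3

# A hypothetical detector for token 1 that promotes token 3 as the next token
# while subtracting the direction of the token that triggered it.
neuron_out = unembed[promoted_token] - unembed[trigger_token]

effect = vocab_effect(unembed, neuron_out)
most_promoted = int(np.argmax(effect))    # token whose logit rises the most
most_suppressed = int(np.argmin(effect))  # token whose logit falls the most
```

In this toy case the largest positive entry corresponds to the promoted next-token candidate and the largest negative entry to the triggering token, which is the pattern the finding describes.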
Abstract
We analyze a family of large language models in a manner lightweight enough to be done on a single GPU. Specifically, we focus on the OPT family of models ranging from 125m to 66b parameters and rely only on whether an FFN neuron is activated or not. First, we find that the early part of the network is sparse and represents many discrete features. Here, many neurons (more than 70% in some layers of the 66b model) are "dead", i.e., they never activate on a large collection of diverse data. At the same time, many of the alive neurons are reserved for discrete features and act as token and n-gram detectors. Interestingly, their corresponding FFN updates not only promote next token candidates, as could be expected, but also explicitly focus on removing the information about the tokens triggering them, i.e., the current input. To the best of our knowledge, this is the first example of mechanisms…
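The "activated or not" criterion in the abstract can be sketched as a small bookkeeping routine. This is a minimal illustration, assuming ReLU FFNs (so a neuron is active on a token when its pre-activation is positive) and assuming the pre-activations of one layer are available as arrays of shape (tokens, neurons). The function names and the toy data are hypothetical, not taken from the paper.

```python
import numpy as np

def update_ever_active(ever_active: np.ndarray, batch_pre: np.ndarray) -> np.ndarray:
    """Mark neurons that fire on at least one token of this batch.

    ever_active: (num_neurons,) boolean mask accumulated so far.
    batch_pre:   (num_tokens, num_neurons) FFN pre-activations for one layer.
    """
    return ever_active | (batch_pre > 0).any(axis=0)

def dead_fraction(ever_active: np.ndarray) -> float:
    """Share of neurons that never fired on any token seen so far."""
    return float(1.0 - ever_active.mean())

# Toy data (assumption): 8 neurons, of which the first 3 are forced to stay negative.
rng = np.random.default_rng(0)
ever_active = np.zeros(8, dtype=bool)
for _ in range(4):  # four batches of 250 tokens each
    pre = rng.normal(size=(250, 8))
    pre[:, :3] = -np.abs(pre[:, :3]) - 0.1
    ever_active = update_ever_active(ever_active, pre)

frac = dead_fraction(ever_active)
```

Only one boolean per neuron is kept between batches, so the memory cost is independent of the amount of text processed, which fits the lightweight, single-GPU framing of the analysis.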
Taxonomy
Topics: Ferroelectric and Negative Capacitance Devices · Topic Modeling · Machine Learning in Materials Science
Methods: Focus · OPT
