Who Does What in Deep Learning? Multidimensional Game-Theoretic Attribution of Function of Neural Units
Shrey Dixit, Kayson Fakhar, Fatemeh Hadaeghi, Patrick Mineault, Konrad P. Kording, Claus C. Hilgetag

TL;DR
This paper introduces Multiperturbation Shapley-value Analysis (MSA), a game-theoretic method that quantifies the contribution of neural units to high-dimensional outputs, enhancing interpretability of complex neural networks.
Contribution
The paper presents MSA, a novel, model-agnostic framework that provides detailed, output-aligned contribution maps for neural units across various deep learning architectures.
Findings
MSA reveals that regularisation concentrates computation in a few hubs.
MSA exposes language-specific experts in LLMs.
MSA uncovers an inverted pixel-generation hierarchy in GANs.
Abstract
Neural networks now generate text, images, and speech with billions of parameters, producing a need to know how each neural unit contributes to these high-dimensional outputs. Existing explainable-AI methods, such as SHAP, attribute importance to inputs, but cannot quantify the contributions of neural units across thousands of output pixels, tokens, or logits. Here we close that gap with Multiperturbation Shapley-value Analysis (MSA), a model-agnostic game-theoretic framework. By systematically lesioning combinations of units, MSA yields Shapley Modes, unit-wise contribution maps that share the exact dimensionality of the model's output. We apply MSA across scales, from multi-layer perceptrons to the 56-billion-parameter Mixtral-8x7B and Generative Adversarial Networks (GAN). The approach demonstrates how regularisation concentrates computation in a few hubs, exposes language-specific…
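The lesion-and-attribute idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a toy one-hidden-layer network, treats "lesioning" as zeroing a hidden unit's activation, and estimates Shapley values by sampling random unit orderings. The names `forward` and `shapley_modes` are hypothetical. Each unit receives a contribution vector with the same dimensionality as the network output, which is the defining property of a Shapley Mode.

```python
import numpy as np

def forward(x, W1, W2, mask):
    """Toy network (an assumption for illustration).

    mask[u] == 0 lesions hidden unit u by zeroing its activation.
    """
    h = np.maximum(0.0, x @ W1) * mask
    return np.tanh(h @ W2)

def shapley_modes(x, W1, W2, n_perm=200, seed=0):
    """Permutation-sampling estimate of per-unit contribution vectors."""
    rng = np.random.default_rng(seed)
    n_units = W1.shape[1]
    modes = np.zeros((n_units, W2.shape[1]))
    for _ in range(n_perm):
        order = rng.permutation(n_units)
        mask = np.zeros(n_units)          # start with every unit lesioned
        prev = forward(x, W1, W2, mask)
        for u in order:
            mask[u] = 1.0                 # restore unit u
            cur = forward(x, W1, W2, mask)
            modes[u] += cur - prev        # marginal contribution, one value per output dim
            prev = cur
    return modes / n_perm

# Hypothetical example: 4 inputs, 6 hidden units, 3 outputs.
rng = np.random.default_rng(1)
W1 = rng.normal(size=(4, 6))
W2 = rng.normal(size=(6, 3))
x = rng.normal(size=4)
modes = shapley_modes(x, W1, W2)          # shape (6, 3): one row per hidden unit
```

Summing the rows of `modes` recovers the difference between the intact and the fully lesioned network output, which is the efficiency property of Shapley values. Exact computation would need every one of the 2^n unit coalitions, so sampling orderings is a common way to keep the cost manageable.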
Taxonomy
Topics: Neural Networks and Applications · Statistical Mechanics and Entropy · Computability, Logic, AI Algorithms
