Who Does What in Deep Learning? Multidimensional Game-Theoretic Attribution of Function of Neural Units

Shrey Dixit; Kayson Fakhar; Fatemeh Hadaeghi; Patrick Mineault; Konrad P. Kording; Claus C. Hilgetag

arXiv:2506.19732 · cs.LG · June 25, 2025

Open Access

TL;DR

This paper introduces Multiperturbation Shapley-value Analysis (MSA), a game-theoretic method that quantifies the contribution of neural units to high-dimensional outputs, enhancing interpretability of complex neural networks.

Contribution

The paper presents MSA, a novel, model-agnostic framework that provides detailed, output-aligned contribution maps for neural units across various deep learning architectures.

Findings

01 MSA reveals that regularisation concentrates computation in a few hubs.

02 MSA exposes language-specific experts in LLMs.

03 MSA uncovers an inverted pixel-generation hierarchy in GANs.

Abstract

Neural networks now generate text, images, and speech with billions of parameters, producing a need to know how each neural unit contributes to these high-dimensional outputs. Existing explainable-AI methods, such as SHAP, attribute importance to inputs, but cannot quantify the contributions of neural units across thousands of output pixels, tokens, or logits. Here we close that gap with Multiperturbation Shapley-value Analysis (MSA), a model-agnostic game-theoretic framework. By systematically lesioning combinations of units, MSA yields Shapley Modes, unit-wise contribution maps that share the exact dimensionality of the model's output. We apply MSA across scales, from multi-layer perceptrons to the 56-billion-parameter Mixtral-8x7B and Generative Adversarial Networks (GAN). The approach demonstrates how regularisation concentrates computation in a few hubs, exposes language-specific…
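
The lesioning idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the tiny two-layer network, the names `tiny_mlp` and `shapley_modes`, and the choice of zeroing a hidden unit as the "lesion" are all assumptions made for illustration. The sketch uses standard permutation sampling to estimate Shapley values, but keeps each marginal contribution as a full output vector, so every unit receives a contribution map with the same dimensionality as the model's output.

```python
import numpy as np

def tiny_mlp(x, w1, w2, active):
    """Forward pass of a hypothetical 2-layer MLP.

    Hidden units not listed in `active` are lesioned (set to zero).
    """
    hidden = np.maximum(0.0, x @ w1)              # ReLU hidden layer
    mask = np.zeros(w1.shape[1])
    mask[np.array(sorted(active), dtype=int)] = 1.0
    return (hidden * mask) @ w2                   # output vector

def shapley_modes(value_fn, n_units, n_permutations, rng):
    """Permutation-sampling estimate of vector-valued Shapley contributions.

    Returns an array of shape (n_units, output_dim): one map per unit.
    """
    out_dim = value_fn(set()).shape[0]
    modes = np.zeros((n_units, out_dim))
    for _ in range(n_permutations):
        active = set()
        prev = value_fn(active)                   # output with all units lesioned
        for unit in rng.permutation(n_units):
            active.add(int(unit))                 # restore one unit at a time
            curr = value_fn(active)
            modes[unit] += curr - prev            # marginal contribution (a vector)
            prev = curr
    return modes / n_permutations

rng = np.random.default_rng(0)
x = rng.normal(size=4)                            # one illustrative input
w1 = rng.normal(size=(4, 5))                      # 5 hidden units
w2 = rng.normal(size=(5, 3))                      # 3-dimensional output

def value_fn(active):
    return tiny_mlp(x, w1, w2, active)

modes = shapley_modes(value_fn, n_units=5, n_permutations=20, rng=rng)
```

Within each sampled permutation the marginal contributions telescope, so the rows of `modes` sum to the intact network's output minus the fully lesioned output. For networks with billions of parameters, enumerating or even densely sampling coalitions this way would be far too expensive; the sketch only shows the shape of the computation.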

Taxonomy

Topics: Neural Networks and Applications · Statistical Mechanics and Entropy · Computability, Logic, AI Algorithms