NeuroSurgeon: A Toolkit for Subnetwork Analysis
Michael A. Lepori, Ellie Pavlick, Thomas Serre

TL;DR
NeuroSurgeon is a Python toolkit designed to identify and manipulate subnetworks within neural network models, aiding in the interpretability and understanding of learned representations.
Contribution
The paper introduces NeuroSurgeon, a novel Python library that enables subnetwork discovery and manipulation in Huggingface Transformer models, facilitating explainability research.
Findings
Enables analysis of functional circuits in neural networks
Supports manipulation of subnetworks for interpretability
Available as an open-source Python library
Abstract
Despite recent advances in the field of explainability, much remains unknown about the algorithms that neural networks learn to represent. Recent work has attempted to understand trained models by decomposing them into functional circuits (Csordás et al., 2020; Lepori et al., 2023). To advance this research, we developed NeuroSurgeon, a Python library that can be used to discover and manipulate subnetworks within models in the Huggingface Transformers library (Wolf et al., 2019). NeuroSurgeon is freely available at https://github.com/mlepori1/NeuroSurgeon.
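To make the idea of "discovering and manipulating subnetworks" concrete, the sketch below models a subnetwork as a binary mask over the weights of a single linear layer. It is a minimal illustration in plain Python and does not use or reproduce NeuroSurgeon's API. The helper names (`linear`, `apply_mask`, `invert_mask`) and the hand-written mask are assumptions made for this example. In practice, a mask would typically be learned rather than specified by hand.

```python
# Illustrative sketch only: NOT NeuroSurgeon's API. All names are hypothetical.
# A "subnetwork" is modeled as a binary mask over a layer's weight matrix.

def linear(weights, x):
    """Compute y = W x, with W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def apply_mask(weights, mask):
    """Keep each weight where the mask is 1 and zero it where the mask is 0."""
    return [[w * m for w, m in zip(w_row, m_row)]
            for w_row, m_row in zip(weights, mask)]

def invert_mask(mask):
    """Flip the mask so that it selects the complement of the subnetwork."""
    return [[1 - m for m in row] for row in mask]

weights = [[1.0, 2.0, 3.0],
           [4.0, 5.0, 6.0]]
mask = [[1, 0, 1],   # hand-written for illustration; assumed, not learned
        [0, 1, 0]]
x = [1.0, 1.0, 1.0]

full_out = linear(weights, x)                                    # whole layer
subnetwork_out = linear(apply_mask(weights, mask), x)            # subnetwork only
ablated_out = linear(apply_mask(weights, invert_mask(mask)), x)  # subnetwork removed

print(full_out, subnetwork_out, ablated_out)
```

Comparing `subnetwork_out` and `ablated_out` against `full_out` is the basic manipulation this kind of analysis relies on: run only the candidate subnetwork, or remove it and observe what changes in the output.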
Taxonomy
Topics: Machine Learning in Materials Science · Explainable Artificial Intelligence (XAI) · Cell Image Analysis Techniques
Methods: Lib
