Sheaf Discovery with Joint Computation Graph Pruning and Flexible Granularity
Lei Yu, Jingcheng Niu, Zining Zhu, Xi Chen, Gerald Penn

TL;DR
DiscoGP introduces a gradient-based pruning framework to identify minimal, modular sheaves within neural language models, preserving core capabilities while significantly reducing model complexity and enhancing interpretability.
Contribution
The paper presents a novel method for extracting sheaves that extend functional circuits by considering both edges and weights, improving interpretability and modularity in LLMs.
Findings
Sheaves preserve 93%-100% of the model's performance on the identified task.
Sheaves comprise only 1%-7% of the original weights and connections.
DiscoGP outperforms previous circuit identification methods.
Abstract
In this paper, we introduce DiscoGP, a novel framework for extracting self-contained modular units, or sheaves, within neural language models (LMs). Sheaves extend the concept of functional circuits, a unit widely explored in interpretability research, by considering not only subsets of edges in an LM's computation graph but also the model's weight parameters. Our framework identifies sheaves through a gradient-based pruning algorithm that operates on both of these in such a way that reduces the original LM to a sparse skeleton that preserves certain core capabilities. Experimental results demonstrate that, across a range of linguistic and reasoning tasks, DiscoGP extracts sheaves that preserve 93%-100% of a model's performance on the identified task while comprising only 1%-7% of the original weights and connections. Furthermore, our analysis reveals that, compared to previously…
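The abstract describes gradient-based pruning only at a high level, so the following is a minimal sketch of the general idea, not the authors' implementation. It assumes a plain sigmoid relaxation of a binary mask over the weights of a toy linear model, trained with a squared-error term plus a sparsity penalty. The names `learn_mask`, `lam`, `lr`, and `init_logit` are hypothetical, and the paper's actual mask parameterization and objective may differ. A mask over computation-graph edges could be learned in the same way, with one logit per edge instead of one per weight.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def learn_mask(weights, xs, ys, lam=0.1, lr=2.0, steps=2000, init_logit=2.0):
    """Learn a soft mask over frozen weights by gradient descent on mask logits.

    Objective (an assumption for this sketch):
        mean((sum_i m_i * w_i * x_i - y)^2) + lam * sum_i m_i,  m_i = sigmoid(logit_i)
    """
    n = len(weights)
    logits = [init_logit] * n
    for _ in range(steps):
        mask = [sigmoid(l) for l in logits]
        # Start from the gradient of the sparsity penalty lam * sum(mask).
        grad_m = [lam] * n
        for x, y in zip(xs, ys):
            pred = sum(m * w * xi for m, w, xi in zip(mask, weights, x))
            err = pred - y
            for i in range(n):
                # Gradient of the mean squared error w.r.t. mask entry i.
                grad_m[i] += 2.0 * err * weights[i] * x[i] / len(xs)
        for i in range(n):
            # Chain rule through the sigmoid: dm/dlogit = m * (1 - m).
            logits[i] -= lr * grad_m[i] * mask[i] * (1.0 - mask[i])
    return [sigmoid(l) for l in logits]


# Toy "model": two weights that matter and two that barely contribute.
rng = random.Random(0)
weights = [3.0, 0.01, -2.0, 0.02]
xs = [[rng.uniform(-1.0, 1.0) for _ in weights] for _ in range(200)]
ys = [sum(w * xi for w, xi in zip(weights, x)) for x in xs]

mask = learn_mask(weights, xs, ys)
kept = [i for i, m in enumerate(mask) if m > 0.5]  # binarize the soft mask

# Error of the pruned model (hard mask) against the full model's outputs.
pruned_mse = sum(
    (sum(weights[i] * x[i] for i in kept) - y) ** 2 for x, y in zip(xs, ys)
) / len(xs)
```

In this toy setup the sparsity penalty drives the mask entries of the two near-zero weights toward 0, while the task loss keeps the two large weights switched on, so the pruned model stays close to the full one with half of its weights. The trade-off between sparsity and fidelity is controlled by `lam`.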
Taxonomy
Topics: Low-power high-performance VLSI design · VLSI and FPGA Design Techniques · Evolutionary Algorithms and Applications
Methods: Activation Patching · Pruning
