Sheaf Discovery with Joint Computation Graph Pruning and Flexible Granularity

Lei Yu; Jingcheng Niu; Zining Zhu; Xi Chen; Gerald Penn

arXiv:2407.03779·cs.CL·September 30, 2025

TL;DR

DiscoGP introduces a gradient-based pruning framework to identify minimal, modular sheaves within neural language models, preserving core capabilities while significantly reducing model complexity and enhancing interpretability.

Contribution

The paper presents a method for extracting sheaves, which extend functional circuits by covering both the edges of an LM's computation graph and its weight parameters, with the aim of improving interpretability and modularity in LMs.

Findings

01 Sheaves preserve 93%-100% of a model's performance on the identified task.

02 Sheaves comprise only 1%-7% of the original weights and connections.

03 DiscoGP outperforms previous circuit identification methods.

Abstract

In this paper, we introduce DiscoGP, a novel framework for extracting self-contained modular units, or sheaves, within neural language models (LMs). Sheaves extend the concept of functional circuits, a unit widely explored in interpretability research, by considering not only subsets of edges in an LM's computation graph but also the model's weight parameters. Our framework identifies sheaves through a gradient-based pruning algorithm that operates on both edges and weights, reducing the original LM to a sparse skeleton that preserves certain core capabilities. Experimental results demonstrate that, across a range of linguistic and reasoning tasks, DiscoGP extracts sheaves that preserve 93%-100% of a model's performance on the identified task while comprising only 1%-7% of the original weights and connections. Furthermore, our analysis reveals that, compared to previously…
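
To make the idea of gradient-based pruning concrete, the following is a minimal sketch and not the DiscoGP implementation. It assumes a toy frozen linear model, one learnable mask logit per weight, a plain sigmoid relaxation, and an L1-style sparsity penalty; the abstract does not specify any of these details, and the edge-pruning half of the method is not modelled here. All names (`learn_mask`, `sparsity_weight`, and so on) are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def learn_mask(weights, inputs, sparsity_weight=0.05, lr=2.0, steps=1000):
    """Learn one mask logit per frozen weight by gradient descent.

    Loss = mean squared gap between masked and full model outputs
           + sparsity_weight * sum of mask values.
    The weights themselves are never updated; only the mask is trained.
    """
    n = len(weights)
    logits = [3.0] * n  # start with every weight (almost) fully kept
    full_out = [sum(w * x for w, x in zip(weights, xs)) for xs in inputs]
    for _ in range(steps):
        mask = [sigmoid(l) for l in logits]
        grad_mask = [0.0] * n
        for xs, target in zip(inputs, full_out):
            pred = sum(m * w * x for m, w, x in zip(mask, weights, xs))
            err = pred - target
            for i in range(n):
                # derivative of the mean squared error w.r.t. mask[i]
                grad_mask[i] += 2.0 * err * weights[i] * xs[i] / len(inputs)
        for i in range(n):
            # chain rule through the sigmoid, plus the sparsity penalty
            grad_logit = (grad_mask[i] + sparsity_weight) * mask[i] * (1.0 - mask[i])
            logits[i] -= lr * grad_logit
    # binarize the relaxed mask to obtain the pruned "skeleton"
    return [sigmoid(l) > 0.5 for l in logits]


# Toy setup (assumed): two weights carry the behaviour, six are near zero.
rng = random.Random(0)
weights = [2.0, -1.5, 0.02, -0.01, 0.03, 0.0, 0.01, -0.02]
inputs = [[rng.uniform(-1.0, 1.0) for _ in weights] for _ in range(200)]

keep = learn_mask(weights, inputs)
kept_indices = [i for i, k in enumerate(keep) if k]
density = len(kept_indices) / len(weights)

# Fidelity of the pruned model against the full one on the same inputs.
mse = sum(
    (sum(w * x for w, x, k in zip(weights, xs, keep) if k)
     - sum(w * x for w, x in zip(weights, xs))) ** 2
    for xs in inputs
) / len(inputs)
print(kept_indices, density, mse)
```

In this toy setting the sparsity penalty pushes the masks of the near-zero weights below the threshold, while the fidelity term holds the masks of the two large weights open. That is the same trade-off the abstract describes at scale: a small retained fraction that still reproduces the original behaviour on the task.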

Taxonomy

Topics: Low-power high-performance VLSI design · VLSI and FPGA Design Techniques · Evolutionary Algorithms and Applications

Methods: Activation Patching · Pruning