Stochastic Parameter Decomposition

Lucius Bushnaq; Dan Braun; Lee Sharkey

arXiv:2506.20790 · cs.LG · September 5, 2025

TL;DR

This paper introduces Stochastic Parameter Decomposition (SPD), a method for decomposing neural network parameters into simpler parts. Compared with Attribution-based Parameter Decomposition (APD), SPD is more scalable, more robust to hyperparameters, and better at identifying ground-truth mechanisms in toy models.

Contribution

The paper presents SPD, a scalable and hyperparameter-robust parameter decomposition method that can decompose models slightly larger and more complex than was possible with APD, bridging causal analysis and network interpretability.

Findings

01. SPD is more scalable than APD for larger models

02. SPD is more robust to hyperparameters

03. SPD better identifies ground truth mechanisms in toy models

Abstract

A key step in reverse engineering neural networks is to decompose them into simpler parts that can be studied in relative isolation. Linear parameter decomposition -- a framework that has been proposed to resolve several issues with current decomposition methods -- decomposes neural network parameters into a sum of sparsely used vectors in parameter space. However, the current main method in this framework, Attribution-based Parameter Decomposition (APD), is impractical on account of its computational cost and sensitivity to hyperparameters. In this work, we introduce Stochastic Parameter Decomposition (SPD), a method that is more scalable and robust to hyperparameters than APD, which we demonstrate by decomposing models that are slightly larger and more complex than was possible to decompose with APD. We also show that SPD avoids other issues, such as shrinkage of the learned…
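
As a rough illustration of what "a sum of sparsely used vectors in parameter space" can mean, the toy sketch below writes a small weight matrix as a sum of components and shows that an input relying on only one component gets the same output when the other components are switched off. It is not the SPD training procedure. The rank-one form of each component, the binary masks, and all names (`components`, `weighted_sum`, `matvec`) are assumptions made for this example.

```python
# Toy sketch (hypothetical names throughout): a 2x3 weight matrix expressed as
# a sum of rank-one components, with per-component masks that switch them off.

def outer(u, v):
    """Outer product u v^T as a list of rows."""
    return [[ui * vj for vj in v] for ui in u]

def matvec(w, x):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]

# Each (u_c, v_c) pair defines one component u_c v_c^T, i.e. one vector in
# parameter space. The rank-one form is an assumption for illustration.
components = [
    ([1.0, 0.0], [1.0, 0.0, 0.0]),
    ([0.0, 1.0], [0.0, 1.0, 0.0]),
    ([1.0, 1.0], [0.0, 0.0, 1.0]),
]

def weighted_sum(masks):
    """Return sum_c masks[c] * u_c v_c^T."""
    total = [[0.0] * 3 for _ in range(2)]
    for m, (u, v) in zip(masks, components):
        comp = outer(u, v)
        total = [[t + m * e for t, e in zip(tr, cr)]
                 for tr, cr in zip(total, comp)]
    return total

# The full weight matrix is the sum of all components.
W = weighted_sum([1.0, 1.0, 1.0])

# This input only has weight on the first input coordinate, so only the first
# component contributes; ablating the other two leaves the output unchanged.
x = [2.0, 0.0, 0.0]
full_out = matvec(W, x)
sparse_out = matvec(weighted_sum([1.0, 0.0, 0.0]), x)
```

Here `full_out` and `sparse_out` agree, which is the sense in which a component can be "unused" on a given input: removing it does not change the network's output there.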

Code & Models

Repositories

goodfire-ai/spd
jax · Official

Taxonomy

Topics: Probabilistic and Robust Engineering Design · Simulation Techniques and Applications