Instilling Inductive Biases with Subnetworks
Enyan Zhang, Michael A. Lepori, Ellie Pavlick

TL;DR
This paper introduces Subtask Induction, a mechanistic approach to instill inductive biases in neural networks by discovering and leveraging functional subnetworks, improving data efficiency and task generalization.
Contribution
The paper proposes Subtask Induction, a method that instills inductive biases by discovering a functional subnetwork in a trained model and reusing it, improving data efficiency and generalization.
Findings
Reduces training data needed for modular arithmetic tasks.
Induces human-like shape bias in image classification models.
Effective for both convolutional and transformer architectures.
Abstract
Despite the recent success of artificial neural networks on a variety of tasks, we have little knowledge or control over the exact solutions these models implement. Instilling inductive biases -- preferences for some solutions over others -- into these models is one promising path toward understanding and controlling their behavior. Much work has been done to study the inherent inductive biases of models and instill different inductive biases through hand-designed architectures or carefully curated training regimens. In this work, we explore a more mechanistic approach: Subtask Induction. Our method discovers a functional subnetwork that implements a particular subtask within a trained model and uses it to instill inductive biases towards solutions utilizing that subtask. Subtask Induction is flexible and efficient, and we demonstrate its effectiveness with two experiments. First, we…
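The transfer step can be illustrated with a minimal sketch. It assumes, following the first peer review below, that the subnetwork is represented as a binary mask over the trained parameters and that the remaining weights are randomly re-initialized before training on the new task. The function name `induce_subtask`, the flat list-of-floats weight layout, and the Gaussian re-initialization with `init_scale` are hypothetical choices made for illustration, not the authors' implementation.

```python
import random


def induce_subtask(trained, mask, init_scale=0.1, seed=0):
    """Build an initialization that keeps subnetwork weights and re-draws the rest.

    trained:    flat list of weights from the already-trained model (assumed layout)
    mask:       flat list of 0/1 flags; 1 marks a weight inside the subnetwork
    init_scale: std of the Gaussian used for re-initialization (an assumption)
    """
    if len(trained) != len(mask):
        raise ValueError("trained and mask must have the same length")
    rng = random.Random(seed)  # local RNG so the result is reproducible
    return [w if m else rng.gauss(0.0, init_scale) for w, m in zip(trained, mask)]


# Toy example: the first two weights belong to the discovered subnetwork.
trained = [0.8, -1.2, 0.05, 0.3]
mask = [1, 1, 0, 0]
new_init = induce_subtask(trained, mask)
```

The resulting `new_init` would then serve as the starting point for training on the second task. Whether the preserved weights are frozen or fine-tuned during that training is not stated in this summary, so the sketch leaves that choice open.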
Peer Reviews
Decision·Submitted to ICLR 2024
I really enjoyed this paper. It takes the observation from the mechanistic interpretability literature that deep networks learn subnetworks to solve specific tasks and uses it to derive a simple method for extracting these subnetworks (essentially by optimizing a sparsity mask over the parameters on a distribution of problems that only require the subnetwork). The authors then randomly initialize the remaining weights of the network and train on a second task that requires the shared skill. The result
The requirement for a dataset like mean-pooled ImageNet to extract the subnetwork significantly constrains how widely applicable this paper is as a method (it essentially requires that you know the task in advance and how to specify it with examples that are sufficiently different from typical examples in the training set), but I still think that it is an interesting demonstration. I would have liked to see some examples of where it fails: for example, if you don't have a clear separation between I
- The method can potentially help solve, in the long run, problems related to spurious correlations, out-of-domain generalization, and more.
- Well-documented code is made available through an anonymous link.
- The method is valid for one subtask, but extending it to more than one is nontrivial and perhaps not possible, since the subnetworks could either overlap or occupy the whole network (in which case we would just be doing transfer learning).
- It is not easy to identify the subnetwork of a model that performs a subtask. I see this as a major limitation, but it doesn't seem impossible to me that in the future there could be some solutions.
- The method relies on finding a subnetwork, which see
Overall the problem statement is quite interesting and the presentation is very clear and easy to follow.
I found the motivation to be a little lacking. The authors mention mechanistic interpretability; however, it is unclear to me (based on the current writing) why we should care about such interpretability. I would encourage the authors to give some examples to demonstrate how it can be used in reality. For example, what are some scenarios in which we can accurately define subtasks (in the vision experiment you mention shape vs texture, I understand that the problem is extensively studied, but i
Taxonomy
Topics: Machine Learning and Data Classification · Advanced Neural Network Applications · Neural Networks and Applications
