Inductive biases of multi-task learning and finetuning: multiple regimes of feature reuse

Samuel Lippl; Jack W. Lindsey

arXiv:2310.02396 · cs.LG · November 4, 2024 · 2 cites

TL;DR

This paper investigates the implicit regularization biases in multi-task learning and finetuning, revealing feature reuse patterns, a novel nested feature selection regime, and their impact on neural network performance.

Contribution

It characterizes the inductive biases of MTL and PT+FT, introduces the nested feature selection regime, and demonstrates how weight rescaling can enhance finetuning in deep networks.

Findings

01. MTL and PT+FT favor feature reuse and sparsity

02. Nested feature selection is a distinct regime in PT+FT

03. Weight rescaling improves finetuning performance

Abstract

Neural networks are often trained on multiple tasks, either simultaneously (multi-task learning, MTL) or sequentially (pretraining and subsequent finetuning, PT+FT). In particular, it is common practice to pretrain neural networks on a large auxiliary task before finetuning on a downstream task with fewer samples. Despite the prevalence of this approach, the inductive biases that arise from learning multiple tasks are poorly characterized. In this work, we address this gap. We describe novel implicit regularization penalties associated with MTL and PT+FT in diagonal linear networks and single-hidden-layer ReLU networks. These penalties indicate that MTL and PT+FT induce the network to reuse features in different ways. 1) Both MTL and PT+FT exhibit biases towards feature reuse between tasks, and towards sparsity in the set of learned features. We show a "conservation law" that implies a…

Code & Models

Repositories

sflippl/multi-task (PyTorch, official)

Taxonomy

Topics: Neural Networks and Applications

Methods: Feature Selection