How Infinitely Wide Neural Networks Can Benefit from Multi-task Learning -- an Exact Macroscopic Characterization

Jakob Heiss; Josef Teichmann; Hanna Wutte

arXiv:2112.15577·cs.LG·October 21, 2022

TL;DR

This paper proves that wide ReLU neural networks trained with L2 regularization on their parameters benefit from multi-task learning even in the infinite-width limit, because they still learn representations shared among tasks.

Contribution

It provides an exact quantitative characterization of how infinite-width ReLU networks trained with L2 regularization benefit from multi-task learning through representation learning.

Findings

01

Optimizing infinite-width ReLU networks with L2 regularization on the parameters promotes multi-task learning.

02

Representation learning persists in the infinite-width limit for regularized networks.

03

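Well-known infinite-width limits such as neural tangent kernels make wide ReLU networks behave like Gaussian processes with a fixed kernel, so networks in those settings cannot benefit from multi-task learning.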

Abstract

In practice, multi-task learning (through learning features shared among tasks) is an essential property of deep neural networks (NNs). While infinite-width limits of NNs can provide good intuition for their generalization behavior, the well-known infinite-width limits of NNs in the literature (e.g., neural tangent kernels) assume specific settings in which wide ReLU-NNs behave like shallow Gaussian Processes with a fixed kernel. Consequently, in such settings, these NNs lose their ability to benefit from multi-task learning in the infinite-width limit. In contrast, we prove that optimizing wide ReLU neural networks with at least one hidden layer using L2-regularization on the parameters promotes multi-task learning due to representation-learning - also in the limiting regime where the network width tends to infinity. We present an exact quantitative characterization of this infinite…
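The setting described in the abstract can be illustrated with a small sketch: a ReLU network with one hidden layer whose features are shared by two output tasks, trained by gradient descent on a squared loss plus an L2 penalty on all parameters. This is not the authors' implementation and is not taken from the linked repository. The function names (`init_params`, `forward`, `regularized_loss`, `gradient_step`), the toy data, the finite width of 64, and the values of `lam` and `lr` are all assumptions chosen for illustration.

```python
import numpy as np

def init_params(n_in, width, n_tasks, seed=0):
    # Hypothetical initialization; the scaling is an illustrative choice.
    rng = np.random.default_rng(seed)
    return {
        "W": rng.normal(size=(n_in, width)) / np.sqrt(n_in),
        "b": np.zeros(width),
        "V": rng.normal(size=(width, n_tasks)) / np.sqrt(width),
        "c": np.zeros(n_tasks),
    }

def forward(params, X):
    # Hidden ReLU features are shared by every task.
    hidden = np.maximum(X @ params["W"] + params["b"], 0.0)
    # One output column per task.
    return hidden @ params["V"] + params["c"]

def l2_penalty(params):
    # Sum of squared entries over all parameters.
    return sum(float(np.sum(p ** 2)) for p in params.values())

def regularized_loss(params, X, Y, lam):
    # Mean squared error over all tasks plus lam times the L2 penalty.
    residual = forward(params, X) - Y
    return float(np.mean(residual ** 2)) + lam * l2_penalty(params)

def gradient_step(params, X, Y, lam, lr):
    # One step of plain gradient descent on regularized_loss.
    n, t = Y.shape
    pre = X @ params["W"] + params["b"]
    hidden = np.maximum(pre, 0.0)
    out = hidden @ params["V"] + params["c"]
    d_out = 2.0 * (out - Y) / (n * t)
    d_hidden = (d_out @ params["V"].T) * (pre > 0)
    grads = {
        "W": X.T @ d_hidden,
        "b": d_hidden.sum(axis=0),
        "V": hidden.T @ d_out,
        "c": d_out.sum(axis=0),
    }
    # The L2 penalty contributes 2 * lam * p to each parameter's gradient.
    return {k: params[k] - lr * (grads[k] + 2.0 * lam * params[k]) for k in params}

# Toy data: two related one-dimensional regression tasks (an assumption).
X = np.linspace(-1.0, 1.0, 20).reshape(-1, 1)
Y = np.hstack([np.abs(X), np.abs(X) + 0.5 * X])

params = init_params(n_in=1, width=64, n_tasks=2)
initial_loss = regularized_loss(params, X, Y, lam=1e-3)
for _ in range(300):
    params = gradient_step(params, X, Y, lam=1e-3, lr=0.01)
final_loss = regularized_loss(params, X, Y, lam=1e-3)
```

Because both tasks read from the same hidden layer and the penalty charges for every weight, reusing a hidden unit across tasks costs less than duplicating it for each task. The sketch only sets up this objective at finite width; it does not reproduce the paper's infinite-width characterization.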

Code & Models

Repositories

JakobHeiss/NN_regularization1
