Probing transfer learning with a model of synthetic correlated datasets

Federica Gerace; Luca Saglietti; Stefano Sarao Mannelli; Andrew Saxe; Lenka Zdeborová

arXiv:2106.05418·cs.LG·June 6, 2023


TL;DR

This paper introduces a solvable model of correlated synthetic datasets for studying transfer learning analytically, revealing conditions under which feature transfer improves the generalization of two-layer neural networks in binary classification tasks.

Contribution

It provides a solvable framework, built on correlated synthetic datasets, that yields an analytic characterization of the generalization performance obtained when a learned feature map is transferred from a source task to a target task.

Findings

01

Transfer learning benefits depend on dataset correlation.

02

The model captures a range of salient features of transfer learning with real data.

03

The analysis systematically characterizes when feature transfer is advantageous.

Abstract

Transfer learning can significantly improve the sample efficiency of neural networks, by exploiting the relatedness between a data-scarce target task and a data-abundant source task. Despite years of successful applications, transfer learning practice often relies on ad-hoc solutions, while theoretical understanding of these procedures is still limited. In the present work, we re-think a solvable model of synthetic data as a framework for modeling correlation between data-sets. This setup allows for an analytic characterization of the generalization performance obtained when transferring the learned feature map from the source to the target task. Focusing on the problem of training two-layer networks in a binary classification setting, we show that our model can capture a range of salient features of transfer learning with real data. Moreover, by exploiting parametric control over the…
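To make the feature-transfer procedure described in the abstract concrete, here is a minimal NumPy sketch. It is a toy illustration under stated assumptions, not the paper's dataset model: the teacher vectors, the relatedness knob `rho`, the tanh activation, and all sizes and learning rates are hypothetical choices made for this example. A two-layer network is trained on a data-abundant source task, its first layer (the feature map) is frozen, and only the second layer is retrained on a data-scarce target task. A network trained from scratch on the target data serves as a baseline.

```python
import numpy as np

rng = np.random.default_rng(0)
d, hidden = 20, 8                # input dimension and hidden width (arbitrary toy values)
n_source, n_target = 2000, 50    # data-abundant source task, data-scarce target task
rho = 0.9                        # hypothetical knob for source/target relatedness

def make_task(w, n):
    """Gaussian inputs with binary labels in {-1, +1} from a linear teacher w (toy assumption)."""
    X = rng.standard_normal((n, d))
    y = np.sign(X @ w)
    y[y == 0] = 1.0
    return X, y

def forward(X, W, v):
    """Two-layer network: feature map h = tanh(X W^T / sqrt(d)), output f = h v."""
    h = np.tanh(X @ W.T / np.sqrt(d))
    return h, h @ v

def train(X, y, W, v, train_first_layer, lr=0.5, epochs=300):
    """Full-batch gradient descent on the logistic loss; optionally keeps W frozen."""
    W, v = W.copy(), v.copy()
    for _ in range(epochs):
        h, f = forward(X, W, v)
        # Derivative of mean(log(1 + exp(-y f))) with respect to f, computed stably.
        g = -y * np.exp(-np.logaddexp(0.0, y * f)) / len(y)
        grad_v = h.T @ g
        if train_first_layer:
            grad_W = ((g[:, None] * v[None, :]) * (1.0 - h**2)).T @ X / np.sqrt(d)
            W -= lr * grad_W
        v -= lr * grad_v
    return W, v

def accuracy(X, y, W, v):
    _, f = forward(X, W, v)
    return float(np.mean(np.sign(f) == y))

# Correlated teachers: rho = 1 gives identical tasks, rho = 0 gives unrelated ones.
w_source = rng.standard_normal(d)
w_target = rho * w_source + np.sqrt(1.0 - rho**2) * rng.standard_normal(d)

X_src, y_src = make_task(w_source, n_source)
X_tgt, y_tgt = make_task(w_target, n_target)
X_test, y_test = make_task(w_target, 2000)

W0 = rng.standard_normal((hidden, d))
v0 = rng.standard_normal(hidden) / np.sqrt(hidden)

# 1) Learn the feature map on the source task.
W_src, _ = train(X_src, y_src, W0, v0, train_first_layer=True)
# 2) Transfer: freeze the source feature map, retrain only the second layer on the target.
W_tl, v_tl = train(X_tgt, y_tgt, W_src, v0, train_first_layer=False)
# 3) Baseline: train both layers from scratch on the small target set.
W_sc, v_sc = train(X_tgt, y_tgt, W0, v0, train_first_layer=True)

acc_transfer = accuracy(X_test, y_test, W_tl, v_tl)
acc_scratch = accuracy(X_test, y_test, W_sc, v_sc)
print(f"target test accuracy, transferred features: {acc_transfer:.3f}")
print(f"target test accuracy, trained from scratch: {acc_scratch:.3f}")
```

Varying `rho` and `n_target` in this sketch gives a rough feel for how the benefit of transfer can depend on task relatedness and target sample size. Any quantitative statement should come from the paper's analysis rather than from this toy.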
