Statistical mechanics of transfer learning in fully-connected networks in the proportional limit
Alessandro Ingrosso, Rosalba Pacelli, Pietro Rotondo, Federica Gerace

TL;DR
This paper develops a statistical mechanics framework for transfer learning (TL) in fully-connected neural networks in the proportional limit, showing that the effectiveness of TL is governed by a renormalized source-target kernel that quantifies how related the two tasks are.
Contribution
It introduces a novel single-instance Franz-Parisi formalism that yields an effective theory for transfer learning in the proportional regime, in which transfer depends on a renormalized source-target kernel that measures task relatedness.
Findings
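The effectiveness of transfer learning depends on a renormalized source-target kernel that quantifies how related the two tasks are.
In the proportional limit, TL can be beneficial, and the benefit arises from this renormalized kernel.
In the lazy-training infinite-width limit, by contrast, TL is ineffective; the proportional-limit theory captures the feature learning that TL relies on.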
Abstract
Transfer learning (TL) is a well-established machine learning technique to boost the generalization performance on a specific (target) task using information gained from a related (source) task, and it crucially depends on the ability of a network to learn useful features. Leveraging recent analytical progress in the proportional regime of deep learning theory (i.e. the limit where the size of the training set and the size of the hidden layers are taken to infinity keeping their ratio finite), in this work we develop a novel single-instance Franz-Parisi formalism that yields an effective theory for TL in fully-connected neural networks. Unlike the (lazy-training) infinite-width limit, where TL is ineffective, we demonstrate that in the proportional limit TL occurs due to a renormalized source-target kernel that quantifies their relatedness and determines whether…
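The proportional regime described in the abstract can be written compactly as below. The symbols are notation assumed here for illustration and are not taken from the paper: P is the size of the training set, N_ℓ the width of hidden layer ℓ, L the number of hidden layers, and α_ℓ their ratio.

```latex
% Assumed notation (not from the paper):
%   P        = size of the training set
%   N_\ell   = width of hidden layer \ell, for \ell = 1, ..., L
%   \alpha_\ell = ratio held fixed in the proportional limit
\[
  P \to \infty, \qquad N_\ell \to \infty, \qquad
  \alpha_\ell = \frac{P}{N_\ell} \ \text{finite}, \qquad \ell = 1, \dots, L .
\]
% For comparison, the infinite-width (lazy-training) limit takes
% N_\ell \to \infty at fixed P, which corresponds to \alpha_\ell \to 0.
```

In this notation, the infinite-width limit where TL is ineffective is the corner α_ℓ → 0, while the regime studied here keeps α_ℓ finite.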
Taxonomy
Topics: Neural Networks and Applications · Machine Learning and ELM
