Transfer Learning of Linear Regression with Multiple Pretrained Models: Benefiting from More Pretrained Models via Overparameterization Debiasing

Daniel Boharon; Yehuda Dar

arXiv:2602.16531 · cs.LG · February 19, 2026

Open Access

TL;DR

This paper investigates how overparameterized pretrained models can be effectively used in transfer learning for linear regression, proposing a debiasing method to improve performance when leveraging multiple models.

Contribution

It introduces an analytical framework for transfer learning with multiple overparameterized pretrained models and proposes a simple debiasing technique to mitigate overparameterization bias.

Findings

1. Using more overparameterized pretrained models can improve transfer learning.

2. Overparameterization bias can hinder learning, but can be reduced with a multiplicative correction.

3. Debiasing enables leveraging more pretrained models for better target predictor performance.
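
The overparameterization bias mentioned in the findings can be illustrated numerically. The sketch below is not the paper's procedure: it assumes isotropic Gaussian source features, a shared true parameter vector, and a correction factor of d/n, all of which are illustrative choices (the names `pretrain_min_norm` and `average_models` are hypothetical). Under these assumptions, a minimum-norm least-squares model trained on n < d examples is, on average, the true vector shrunk by n/d, so multiplying by d/n removes the shrinkage in expectation.

```python
import numpy as np

def pretrain_min_norm(beta, n, rng, noise_std=0.1):
    """Minimum l2-norm least-squares fit on n noisy examples (n < d)."""
    d = beta.shape[0]
    Z = rng.standard_normal((n, d))            # assumed isotropic Gaussian features
    v = Z @ beta + noise_std * rng.standard_normal(n)
    return np.linalg.pinv(Z) @ v               # min-norm interpolating solution

def average_models(beta, n, m, rng, debias):
    """Average m pretrained models, optionally rescaled by the assumed factor d/n."""
    d = beta.shape[0]
    scale = d / n if debias else 1.0           # illustrative multiplicative correction
    models = [scale * pretrain_min_norm(beta, n, rng) for _ in range(m)]
    return np.mean(models, axis=0)

rng = np.random.default_rng(0)
d, n, m = 40, 10, 400
beta = np.ones(d) / np.sqrt(d)                 # unit-norm true parameters

raw = average_models(beta, n, m, rng, debias=False)
debiased = average_models(beta, n, m, rng, debias=True)

print("error of raw average:     ", np.linalg.norm(raw - beta))
print("error of debiased average:", np.linalg.norm(debiased - beta))
```

In this toy setting the raw average stays close to (n/d) times the true vector no matter how many models are averaged, whereas the rescaled average approaches the true vector as m grows, which is consistent with the finding that debiasing makes additional pretrained models useful.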

Abstract

We study transfer learning for a linear regression task using several least-squares pretrained models that can be overparameterized. We formulate the target learning task as an optimization problem that minimizes the squared errors on the target dataset with a penalty on the distance of the learned model from the pretrained models. We analytically formulate the test error of the learned target model and provide the corresponding empirical evaluations. Our results elucidate when using more pretrained models can improve transfer learning. Specifically, if the pretrained models are overparameterized, using sufficiently many of them is important for beneficial transfer learning. However, the learning may be compromised by the overparameterization bias of the pretrained models, i.e., the minimum $\ell_2$-norm solution's restriction to a small subspace spanned by the training examples in the high-dimensional…
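
The penalized formulation described in the abstract can be sketched in a few lines. The abstract does not give the exact form of the penalty, so the version below is an assumption: a single shared weight `alpha` on the squared Euclidean distance to each pretrained model, i.e. minimizing $\|y - Xw\|_2^2 + \alpha \sum_{j=1}^{m} \|w - \theta_j\|_2^2$. The function name `transfer_fit` is hypothetical. Setting the gradient to zero gives the linear system $(X^\top X + \alpha m I)\,w = X^\top y + \alpha \sum_j \theta_j$.

```python
import numpy as np

def transfer_fit(X, y, pretrained, alpha):
    """Solve min_w ||y - X w||^2 + alpha * sum_j ||w - theta_j||^2 in closed form.

    X: (n, d) target features, y: (n,) target labels,
    pretrained: sequence of m arrays of shape (d,), alpha: penalty weight > 0.
    The shared weight alpha is an assumption, not taken from the paper.
    """
    d = X.shape[1]
    m = len(pretrained)
    A = X.T @ X + alpha * m * np.eye(d)             # positive definite for alpha > 0
    b = X.T @ y + alpha * np.sum(pretrained, axis=0)
    return np.linalg.solve(A, b)
```

With this form, a very large `alpha` pulls the solution toward the plain average of the pretrained models, while `alpha` near zero approaches ordinary least squares on the target data alone (when that problem is well posed).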

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Stochastic Gradient Optimization Techniques · Face and Expression Recognition