Expectation Error Bounds for Transfer Learning in Linear Regression and Linear Neural Networks

Meitong Liu; Christopher Jung; Rui Li; Xue Feng; Han Zhao

arXiv:2603.28739 · cs.LG · March 31, 2026

TL;DR

This paper gives expected-error bounds and conditions under which transfer learning improves generalization in linear regression and linear neural networks, supported by new mathematical tools and empirical validation.

Contribution

It derives exact closed-form expressions for the expected generalization error and globally optimal task weights in linear regression, and establishes the first non-vacuous bounds for beneficial transfer in under-parameterized linear neural networks.

Findings

01. Exact generalization error formulas for linear regression with auxiliary tasks.

02. Optimal task weights derived from solvable optimization problems.

03. Non-vacuous sufficient conditions for transfer learning benefits in linear neural networks.
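
For orientation, the generic bias-variance decomposition behind finding 01 can be written as below. The notation is assumed here and may not match the paper's: $\hat\beta$ is any estimator with finite second moments, $\beta^\star$ is the main-task parameter, $\Sigma$ is the test covariate covariance, and expectations are over the training data. The paper's closed-form expressions are not reproduced in this summary.

$$
\mathbb{E}\big[(\hat\beta-\beta^\star)^\top \Sigma\,(\hat\beta-\beta^\star)\big]
= \underbrace{(\mathbb{E}\hat\beta-\beta^\star)^\top \Sigma\,(\mathbb{E}\hat\beta-\beta^\star)}_{\text{bias}}
+ \underbrace{\operatorname{tr}\!\big(\Sigma\,\operatorname{Cov}(\hat\beta)\big)}_{\text{variance}}
$$

Read this way, auxiliary data can help when the variance it removes outweighs the bias it introduces.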

Abstract

In transfer learning, the learner leverages auxiliary data to improve generalization on a main task. However, the precise theoretical understanding of when and how auxiliary data help remains incomplete. We provide new insights on this issue in two canonical linear settings: ordinary least squares regression and under-parameterized linear neural networks. For linear regression, we derive exact closed-form expressions for the expected generalization error with bias-variance decomposition, yielding necessary and sufficient conditions for auxiliary tasks to improve generalization on the main task. We also derive globally optimal task weights as outputs of solvable optimization programs, with consistency guarantees for empirical estimates. For linear neural networks with shared representations of width $q \leq K$, where $K$ is the number of auxiliary tasks, we derive a non-asymptotic…
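
The linear regression setting in the abstract can be explored numerically with a small simulation. The sketch below is not the paper's estimator or its closed-form result: it assumes a single auxiliary task, a scalar task weight `w`, isotropic Gaussian covariates, and Gaussian noise, and the function names `weighted_pooled_ols` and `expected_excess_risk` are hypothetical. It estimates the expected main-task error of a weighted pooled least-squares fit by Monte Carlo.

```python
import numpy as np


def weighted_pooled_ols(X_main, y_main, X_aux, y_aux, w):
    """Least squares on main data plus auxiliary data weighted by w >= 0.

    Hypothetical helper: scaling auxiliary rows by sqrt(w) is equivalent to
    multiplying the auxiliary squared loss by w.
    """
    s = np.sqrt(w)
    X = np.vstack([X_main, s * X_aux])
    y = np.concatenate([y_main, s * y_aux])
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta_hat


def expected_excess_risk(w, beta_main, beta_aux, n_main, n_aux,
                         noise=1.0, trials=2000, seed=0):
    """Monte Carlo estimate of E||beta_hat - beta_main||^2.

    Assumption: test covariates are isotropic, so the excess prediction
    risk equals the squared parameter error.
    """
    rng = np.random.default_rng(seed)
    d = beta_main.shape[0]
    errs = np.empty(trials)
    for t in range(trials):
        X_m = rng.standard_normal((n_main, d))
        X_a = rng.standard_normal((n_aux, d))
        y_m = X_m @ beta_main + noise * rng.standard_normal(n_main)
        y_a = X_a @ beta_aux + noise * rng.standard_normal(n_aux)
        beta_hat = weighted_pooled_ols(X_m, y_m, X_a, y_a, w)
        errs[t] = np.sum((beta_hat - beta_main) ** 2)
    return errs.mean()


if __name__ == "__main__":
    beta = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
    # Sweep the task weight for a nearby auxiliary task (small parameter shift).
    for w in (0.0, 0.25, 0.5, 1.0):
        risk = expected_excess_risk(w, beta, beta + 0.1, n_main=20, n_aux=200)
        print(f"w={w:.2f}  estimated risk={risk:.4f}")
```

Setting `w = 0` recovers ordinary least squares on the main task alone, which gives a baseline for judging whether a given auxiliary task helps. Increasing the gap between `beta_aux` and `beta_main` raises the bias of the pooled fit, which is the trade-off the abstract's necessary and sufficient conditions characterize exactly.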
