Sample Complexity of Transfer Learning: An Optimal Transport Approach

Haoyang Cao; Xin Guo; Wenpin Tang; Guan Wang

arXiv:2605.20545 · stat.ML · May 21, 2026

TL;DR

This paper analyzes the sample efficiency of transfer learning from an optimal transport viewpoint. It shows that transfer learning can outperform direct learning, especially for high-dimensional data and complex models, and supports the theory with numerical experiments.

Contribution

It introduces a rigorous optimal transport-based framework to analyze transfer learning's sample complexity, revealing conditions where transfer learning is more efficient.

Findings

01

When the data dimension $d$ exceeds 3, transfer learning attains a better sample complexity than direct learning.

02

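The sample complexity for transfer learning scales as $O(m^{-(\alpha+1)/d})$, where $m$ is the target sample size and $\alpha$ is the smoothness of the data distribution, compared with $O(m^{-p/d})$ for direct learning, where $p$ is the smoothness of the optimal target model.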

03

Numerical experiments on image classification demonstrate significant performance gains in data-scarce regimes.

Abstract

Transfer learning is an essential technique for many machine learning/AI models of complex structures such as large language models and generative AI. The essence of transfer learning is to leverage knowledge from resolved source tasks for a new target task, especially when the sample size $m$ of the training data for the latter is low. In this work, we rigorously analyze the potential benefit of transfer learning in terms of sample efficiency. Specifically, taking an optimal transport viewpoint of transfer learning, we find that when the data dimension $d$ is higher than $3$, the sample complexity for transfer learning is $O(m^{-(\alpha + 1)/d})$, with $\alpha$ indicating the smoothness of the data distribution, as opposed to the $O(m^{-p/d})$ sample complexity for direct learning with $p$ indicating the smoothness of the optimal target model. Our finding theoretically supports a better…
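
To make the two rates quoted in the abstract concrete, the following is a minimal numerical sketch, not the authors' code. It evaluates $m^{-(\alpha+1)/d}$ and $m^{-p/d}$ with every constant hidden by the $O(\cdot)$ notation set to 1, which is an assumption. The function names and the values of $d$, $\alpha$, and $p$ are hypothetical and chosen only for illustration.

```python
# Illustrative comparison of the two rates quoted in the abstract.
# Assumption: constants hidden by O(.) are set to 1, so only the
# exponents are compared. All names and parameter values are hypothetical.


def transfer_rate(m: float, alpha: float, d: int) -> float:
    """Evaluate m^{-(alpha + 1) / d}, the rate quoted for transfer learning."""
    return m ** (-(alpha + 1) / d)


def direct_rate(m: float, p: float, d: int) -> float:
    """Evaluate m^{-p / d}, the rate quoted for direct learning."""
    return m ** (-p / d)


def samples_for(eps: float, exponent: float) -> float:
    """Solve m^{-exponent} = eps for m (returned as a real number)."""
    return eps ** (-1.0 / exponent)


if __name__ == "__main__":
    d, alpha, p = 8, 2.0, 1.0  # hypothetical values with d > 3
    for m in (16, 256, 4096):
        print(m, transfer_rate(m, alpha, d), direct_rate(m, p, d))
```

With these hypothetical values and $m = 256$, the transfer rate evaluates to $256^{-3/8} = 0.125$ and the direct rate to $256^{-1/8} = 0.5$. Ignoring constants, the transfer rate decays faster exactly when $\alpha + 1 > p$, since both share the same base $m$ and denominator $d$.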
