Provable Target Sample Complexity Improvements as Pre-Trained Models Scale

Kazuto Fukuchi; Ryuichiro Hataya; Kota Matsui

arXiv:2602.04233·stat.ML·February 5, 2026

Open Access

TL;DR

This paper introduces a theoretical framework called caulking that explains how larger pre-trained models reduce the sample complexity needed for downstream tasks, aligning with empirical scaling laws.

Contribution

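The paper provides a theoretical justification for the empirically observed reduction in downstream sample complexity as pre-trained models scale, a phenomenon that, according to the authors, existing theoretical analyses could not explain. It does so through a novel framework, caulking, inspired by parameter-efficient fine-tuning (PEFT) methods.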

Findings

01. Improved pre-trained models provably decrease downstream sample complexity.

02. Theoretical analysis aligns with empirical scaling laws.

03. Framework offers insights into parameter-efficient fine-tuning effects.

Abstract

Pre-trained models have become indispensable for efficiently building models across a broad spectrum of downstream tasks. The advantages of pre-trained models have been highlighted by empirical studies on scaling laws, which demonstrate that larger pre-trained models can significantly reduce the sample complexity of downstream learning. However, existing theoretical investigations of pre-trained models lack the capability to explain this phenomenon. In this paper, we provide a theoretical investigation by introducing a novel framework, caulking, inspired by parameter-efficient fine-tuning (PEFT) methods such as adapter-based fine-tuning, low-rank adaptation, and partial fine-tuning. Our analysis establishes that improved pre-trained models provably decrease the sample complexity of downstream tasks, thereby offering theoretical justification for the empirically observed scaling laws…

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning · Stochastic Gradient Optimization Techniques