Combining pre-trained models via localized model averaging

Ziwen Gao; Baihua He; Yuhong Yang

arXiv:2605.13421·stat.ME·May 14, 2026

TL;DR

This paper introduces a localized model averaging method that combines pre-trained models using weights that vary with the input covariates, so the combination can adapt to where each model performs best across a broad class of prediction tasks.

Contribution

It proposes a flexible, covariate-dependent weighting scheme for model averaging, with theoretical guarantees and broad applicability to various prediction tasks.
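
As a reading aid, the covariate-dependent combination can be written as below. The notation (K candidate models \hat{f}_k, weight functions w_k) and the simplex constraint on the weights are assumptions made for illustration; this summary does not state the paper's exact formulation.

```latex
\hat{f}_{w}(x) = \sum_{k=1}^{K} w_k(x)\, \hat{f}_k(x),
\qquad w_k(x) \ge 0, \quad \sum_{k=1}^{K} w_k(x) = 1 .
```

Under this notation, classical model averaging is the special case in which every w_k(x) is constant in x.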

Findings

01. The method achieves asymptotic optimality for in-sample and out-of-sample risks.

02. Estimated weights are shown to be consistent.

03. Numerical experiments demonstrate the method's effectiveness.

Abstract

Many pre-trained models (PTMs) are available in modern applications. Because different PTMs are often trained on different datasets, their performances can vary substantially for different new tasks, and the ranking of the candidates may depend heavily on the input. Motivated by this, we propose a localized model averaging method with weights modeled as functions of the covariates, making it substantially more versatile than existing model averaging methods. This formulation allows the model averaging procedure to adaptively capture the varying relative advantages of different PTMs across heterogeneous contexts. Specifically, we learn flexible local weights under a general loss framework that accommodates a broad class of prediction tasks. We further establish the asymptotic optimality of the proposed method for both in-sample and out-of-sample risks, as well as the consistency of the…
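
To make the idea of covariate-dependent weights concrete, here is a minimal sketch, not the authors' estimator. It assumes a softmax-of-linear weight function, squared-error loss, and plain gradient descent; the abstract only says the weights are flexible functions of the covariates under a general loss. The two "pre-trained models" are synthetic stand-ins, and every name (`fit_local_weights`, `model_a`, `model_b`, and so on) is hypothetical.

```python
import numpy as np

def softmax(z):
    # Row-wise softmax, shifted by the row maximum for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def design_matrix(x):
    # Intercept plus the single covariate (assumed 1-D for this toy example).
    return np.column_stack([np.ones(len(x)), x])

def fit_local_weights(x, preds, y, lr=0.5, n_iter=2000):
    """Fit theta so that w(x) = softmax([1, x] @ theta) minimizes squared error."""
    D = design_matrix(x)
    theta = np.zeros((D.shape[1], preds.shape[1]))
    for _ in range(n_iter):
        W = softmax(D @ theta)                  # (n, K) local weights
        combo = (W * preds).sum(axis=1)         # weighted prediction
        resid = combo - y
        # Chain rule through the softmax: d combo / d logit_k = W_k * (pred_k - combo)
        grad_logits = 2.0 * resid[:, None] * W * (preds - combo[:, None])
        theta -= lr * (D.T @ grad_logits) / len(y)
    return theta

def predict(x, preds, theta):
    W = softmax(design_matrix(x) @ theta)
    return (W * preds).sum(axis=1), W

# Synthetic stand-ins: model_a is exact for x <= 0, model_b is exact for x >= 0.
rng = np.random.default_rng(0)
truth = lambda x: np.sin(2 * x)
model_a = lambda x: truth(x) + np.maximum(x, 0.0)
model_b = lambda x: truth(x) + np.maximum(-x, 0.0)

def make_data(n):
    x = rng.uniform(-2, 2, n)
    y = truth(x) + 0.1 * rng.normal(size=n)
    return x, np.column_stack([model_a(x), model_b(x)]), y

x_tr, P_tr, y_tr = make_data(500)
x_te, P_te, y_te = make_data(500)

theta = fit_local_weights(x_tr, P_tr, y_tr)
y_hat, W_te = predict(x_te, P_te, theta)

mse = lambda p: float(np.mean((p - y_te) ** 2))
mse_local = mse(y_hat)
mse_a, mse_b = mse(P_te[:, 0]), mse(P_te[:, 1])
mse_equal = mse(P_te.mean(axis=1))  # constant, equal-weight baseline

print(f"model A: {mse_a:.3f}  model B: {mse_b:.3f}")
print(f"equal weights: {mse_equal:.3f}  localized weights: {mse_local:.3f}")
```

In this toy setup the ranking of the two candidates flips at x = 0, so no constant weight vector can favor the better model on both sides, whereas weights that depend on x can. The softmax parametrization is one convenient way to keep the weights non-negative and summing to one; other weight families and losses would slot into the same structure.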
