Combining pre-trained models via localized model averaging
Ziwen Gao, Baihua He, Yuhong Yang

TL;DR
This paper introduces a localized model averaging approach that adaptively combines pre-trained models based on input covariates, improving performance across diverse tasks.
Contribution
It proposes a flexible, covariate-dependent weighting scheme for model averaging, with theoretical guarantees and broad applicability to various prediction tasks.
Findings
The method achieves asymptotic optimality for in-sample and out-of-sample risks.
Estimated weights are shown to be consistent.
Numerical experiments demonstrate the method's effectiveness.
Abstract
Many pre-trained models (PTMs) are available in modern applications. Because different PTMs are often trained on different datasets, their performances can vary substantially for different new tasks, and the ranking of the candidates may depend heavily on the input. Motivated by this, we propose a localized model averaging method with weights modeled as functions of the covariates, making it substantially more versatile than existing model averaging methods. This formulation allows the model averaging procedure to adaptively capture the varying relative advantages of different PTMs across heterogeneous contexts. Specifically, we learn flexible local weights under a general loss framework that accommodates a broad class of prediction tasks. We further establish the asymptotic optimality of the proposed method for both in-sample and out-of-sample risks, as well as the consistency of the…
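The idea of covariate-dependent weights can be illustrated with a small sketch. This is not the authors' estimator: it assumes a softmax weight function with logits linear in the covariates, a squared-error loss, and plain gradient descent, none of which come from the paper. The names `local_weights`, `fit_local_weights`, and the two synthetic "pre-trained models" `f1` and `f2` are hypothetical and exist only for illustration.

```python
import numpy as np

def softmax(z):
    # Row-wise softmax, shifted for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def local_weights(X, theta):
    # Assumed weight model: softmax of logits that are linear in x (with intercept).
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])
    return softmax(Xb @ theta)

def fit_local_weights(X, preds, y, lr=0.5, n_iter=2000):
    # preds[i, k] is the prediction of candidate model k at input X[i].
    # Minimizes 0.5 * mean squared error of the weighted prediction.
    n, K = preds.shape
    Xb = np.hstack([np.ones((n, 1)), X])
    theta = np.zeros((Xb.shape[1], K))
    for _ in range(n_iter):
        W = softmax(Xb @ theta)
        combined = (W * preds).sum(axis=1)
        resid = combined - y
        # Chain rule through the softmax:
        # d(combined)/d(logit_k) = W_k * (preds_k - combined).
        grad_logits = resid[:, None] * W * (preds - combined[:, None])
        theta -= lr * (Xb.T @ grad_logits) / n
    return theta

# Synthetic illustration: each candidate is accurate on only half of the input space.
rng = np.random.default_rng(0)
n = 400
X = rng.uniform(-2.0, 2.0, size=(n, 1))
y = np.sin(X[:, 0])
f1 = np.where(X[:, 0] < 0, y, y + 1.0)   # hypothetical model 1: good for x < 0
f2 = np.where(X[:, 0] >= 0, y, y - 1.0)  # hypothetical model 2: good for x >= 0
preds = np.column_stack([f1, f2])

theta = fit_local_weights(X, preds, y)
W = local_weights(X, theta)
local_mse = np.mean(((W * preds).sum(axis=1) - y) ** 2)
global_mse = np.mean((preds.mean(axis=1) - y) ** 2)  # equal constant weights
print(f"equal-weight MSE: {global_mse:.3f}, localized MSE: {local_mse:.3f}")
```

In this toy setup a single constant weight vector cannot favor one model on one side of zero and the other model on the other side, whereas the input-dependent weights can, which is the kind of heterogeneity the abstract describes.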
