Transfer learning of regression models from a sequence of datasets by penalized estimation
Wessel N. van Wieringen, Harald Binder

TL;DR
This paper introduces a penalized transfer learning method for a sequence of regression datasets. Model updates remain possible when the datasets overlap only partially in their covariates, the iteratively updated estimator is asymptotically unbiased and consistent, and the penalty parameter is chosen by constrained cross-validated loglikelihood.
Contribution
The paper proposes a novel penalized estimation procedure for transfer learning in regression, accommodating sequential datasets with partial covariate overlap and providing theoretical guarantees.
Findings
The iteratively updated regression parameter estimator is asymptotically unbiased and consistent.
The penalty parameter is chosen through constrained cross-validated loglikelihood.
Application demonstrates effectiveness in epidemiological data with batch-wise covariate changes.
Abstract
Transfer learning refers to the promising idea of initializing model fits based on pre-training on other data. We particularly consider regression modeling settings where parameter estimates from previous data can be used as anchoring points, yet may not be available for all parameters, thus covariance information cannot be reused. A procedure that updates through targeted penalized estimation, which shrinks the estimator towards a nonzero value, is presented. The parameter estimate from the previous data serves as this nonzero value when an update is sought from novel data. This naturally extends to a sequence of data sets with the same response, but potentially only partial overlap in covariates. The iteratively updated regression parameter estimator is shown to be asymptotically unbiased and consistent. The penalty parameter is chosen through constrained cross-validated loglikelihood…
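The updating idea in the abstract can be illustrated with a minimal sketch, assuming a linear model with a ridge-type penalty that shrinks towards a nonzero target. The function names `targeted_ridge` and `sequential_update` are hypothetical, as are two conventions adopted here for illustration: a covariate not seen in earlier data gets a target of zero, and a covariate absent from the current data keeps its last estimate. The penalty parameter `lam` is held fixed, whereas the paper chooses it through constrained cross-validated loglikelihood, which is not reproduced here.

```python
import numpy as np


def targeted_ridge(X, y, target, lam):
    """Ridge-type estimate shrunk towards `target` instead of zero.

    Minimizes ||y - X b||^2 + lam * ||b - target||^2, whose closed-form
    solution is (X'X + lam I)^{-1} (X'y + lam * target).
    """
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y + lam * target)


def sequential_update(batches, lam):
    """Update estimates over a sequence of (X, y, covariate_names) batches.

    Assumption (not from the paper): covariates without a previous
    estimate are shrunk towards zero; covariates missing from a batch
    keep their most recent estimate.
    """
    estimate = {}
    for X, y, names in batches:
        # The previous estimate serves as the shrinkage target where available.
        target = np.array([estimate.get(name, 0.0) for name in names])
        beta = targeted_ridge(X, y, target, lam)
        estimate.update(zip(names, beta))
    return estimate


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    truth = {"a": 1.0, "b": -2.0, "c": 0.5}

    def simulate(names, n=100):
        X = rng.normal(size=(n, len(names)))
        coef = np.array([truth[name] for name in names])
        y = X @ coef + 0.1 * rng.normal(size=n)
        return X, y, names

    # Two batches sharing only covariate "b".
    batches = [simulate(["a", "b"]), simulate(["b", "c"])]
    print(sequential_update(batches, lam=1.0))
```

With `lam = 0` the update reduces to ordinary least squares on the current batch, and as `lam` grows the estimate stays closer to the previous one, which is the trade-off the penalty parameter controls.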
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Control Systems and Identification · Statistical Methods and Inference
