Optimal Ensemble Construction for Multi-Study Prediction with Applications to COVID-19 Excess Mortality Estimation

Gabriel Loewinger; Rolando Acosta Nunez; Rahul Mazumder; Giovanni Parmigiani

arXiv:2109.09164 · stat.ML · October 5, 2021

TL;DR

This paper introduces an optimal ensemble construction method for multi-study prediction tasks, improving out-of-study generalization especially in heterogeneous biomedical datasets like COVID-19 mortality prediction.

Contribution

It proposes a joint estimation approach for ensemble weights and study-specific model parameters, unifying and extending existing multi-study stacking and pooling methods.

Findings

01 Outperforms standard methods in COVID-19 mortality prediction.

02 Improves prediction accuracy with limited data from new countries.

03 Remains competitive across various heterogeneity levels.

Abstract

It is increasingly common to encounter prediction tasks in the biomedical sciences for which multiple datasets are available for model training. Common approaches such as pooling datasets and applying standard statistical learning methods can result in poor out-of-study prediction performance when datasets are heterogeneous. Theoretical and applied work has shown multi-study ensembling to be a viable alternative that leverages the variability across datasets in a manner that promotes model generalizability. Multi-study ensembling uses a two-stage stacking strategy which fits study-specific models and estimates ensemble weights separately. This approach ignores, however, the ensemble properties at the model-fitting stage, potentially resulting in a loss of efficiency. We therefore propose optimal ensemble construction, an all-in-one approach to…
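The two-stage stacking baseline that the abstract contrasts against can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the proposed optimal ensemble construction and not the interface of the gloewing/oec repository: the choice of ridge regression for the study-specific models, unconstrained least squares for the ensemble weights, the simulated data, and all function names (`fit_ridge`, `two_stage_stacking`, `ensemble_predict`) are hypothetical. Stacking weights are often constrained (for example to be non-negative); that constraint is omitted here for brevity.

```python
import numpy as np


def fit_ridge(X, y, lam=1.0):
    """Closed-form ridge fit with an unpenalized intercept.

    Returns (intercept, coefs). Ridge is an assumed model class.
    """
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    coefs = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ yc)
    return y_mean - x_mean @ coefs, coefs


def predict(model, X):
    """Predict with a single study-specific model."""
    intercept, coefs = model
    return intercept + X @ coefs


def two_stage_stacking(studies, lam=1.0):
    """Two-stage sketch: fit per-study models, then estimate weights separately."""
    # Stage 1: fit one model per study, ignoring how it will be used in the ensemble.
    models = [fit_ridge(X, y, lam) for X, y in studies]

    # Stage 2: predict on the pooled data with every model, then regress the
    # pooled outcomes on those predictions to obtain ensemble weights.
    X_all = np.vstack([X for X, _ in studies])
    y_all = np.concatenate([y for _, y in studies])
    P = np.column_stack([predict(m, X_all) for m in models])
    weights, *_ = np.linalg.lstsq(P, y_all, rcond=None)  # unconstrained (assumption)
    return models, weights


def ensemble_predict(models, weights, X):
    """Weighted combination of the study-specific predictions."""
    P = np.column_stack([predict(m, X) for m in models])
    return P @ weights


# Simulated heterogeneous studies (illustrative data only).
rng = np.random.default_rng(0)
studies = []
for shift in (0.0, 0.5, -0.5):
    X = rng.normal(size=(50, 4))
    beta = np.array([1.0, -2.0, 0.5, 0.0]) + shift  # study-specific coefficients
    y = X @ beta + rng.normal(scale=0.1, size=50)
    studies.append((X, y))

models, weights = two_stage_stacking(studies)
X_new = rng.normal(size=(5, 4))  # stand-in for a new, unseen study
y_hat = ensemble_predict(models, weights, X_new)
```

In this sketch the stage 1 fits never see the stage 2 weights, which is the separation the abstract describes as a potential source of lost efficiency.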

Code & Models

Repositories

gloewing/oec
Official

Taxonomy

Topics: Machine Learning in Healthcare · COVID-19 diagnosis using AI · Artificial Intelligence in Healthcare