A Semi-supervised CART Model for Covariate Shift
Mingyang Cai, Thomas Klausch, Mark A. van de Wiel

TL;DR
This paper presents a semi-supervised CART model that uses importance weighting to effectively handle covariate shift, improving predictive accuracy in medical and complex datasets.
Contribution
It introduces a weighted CART framework that addresses covariate shift without target outcomes and extends it to generalized linear model trees and ensembles.
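As a reminder of why importance weighting can work without target outcomes, the standard identity is sketched below. The symbols are assumptions introduced here for illustration, not notation taken from the paper: $p_S$ and $p_T$ are the training and target distributions, $L$ is a loss, and $f$ is the fitted model. The identity assumes that the conditional distribution of the outcome given the covariates is the same in both populations and that $p_S(x) > 0$ wherever $p_T(x) > 0$.

```latex
\[
  \mathbb{E}_{(x,y)\sim p_T}\bigl[L(f(x),y)\bigr]
  = \mathbb{E}_{(x,y)\sim p_S}\bigl[w(x)\,L(f(x),y)\bigr],
  \qquad
  w(x) = \frac{p_T(x)}{p_S(x)} .
\]
```

Because $w(x)$ depends only on the covariates, it can in principle be estimated from labelled training data and unlabelled target data.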
Findings
Significant accuracy improvements in simulations
Enhanced predictive performance on real medical data
Versatile framework applicable to complex datasets
Abstract
Machine learning models used in medical applications often face challenges due to covariate shift, which occurs when the distributions of the training and target data differ. This can reduce predictive accuracy, especially when outcomes in the target data are unknown. This paper introduces a semi-supervised classification and regression tree (CART) that uses importance weighting to address these distribution discrepancies. Our method improves the predictive performance of the CART model by assigning greater weights to training samples that more accurately represent the target distribution, especially in cases of covariate shift without target outcomes. In addition to CART, we extend this weighted approach to generalized linear model trees and tree ensembles, creating a versatile framework for managing covariate shift in complex datasets. Through…
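To make the idea of weighting training samples concrete, here is a minimal sketch of how importance weights could enter a regression-tree split search. It is not the authors' implementation: the function names `weighted_sse` and `best_weighted_split` are hypothetical, the weights are assumed to be given (for example, an estimated ratio of target to training covariate densities), and the criterion shown is the ordinary weighted sum of squared errors for a single feature.

```python
from typing import List, Optional, Tuple


def weighted_sse(y: List[float], w: List[float]) -> float:
    """Weighted sum of squared errors around the weighted mean."""
    total = sum(w)
    if total == 0:
        return 0.0
    mean = sum(wi * yi for wi, yi in zip(w, y)) / total
    return sum(wi * (yi - mean) ** 2 for wi, yi in zip(w, y))


def best_weighted_split(
    x: List[float], y: List[float], w: List[float]
) -> Tuple[Optional[float], float]:
    """Search one feature for the threshold minimising weighted SSE.

    Returns (threshold, impurity); threshold is None if no split
    improves on the unsplit node.
    """
    order = sorted(range(len(x)), key=lambda i: x[i])
    best: Tuple[Optional[float], float] = (None, weighted_sse(y, w))
    for k in range(1, len(order)):
        left, right = order[:k], order[k:]
        if x[left[-1]] == x[right[0]]:
            continue  # cannot split between identical feature values
        sse = weighted_sse([y[i] for i in left], [w[i] for i in left]) + \
            weighted_sse([y[i] for i in right], [w[i] for i in right])
        if sse < best[1]:
            # place the threshold halfway between neighbouring values
            best = ((x[left[-1]] + x[right[0]]) / 2, sse)
    return best


# Toy data (made up for illustration only).
x = [1.0, 2.0, 3.0, 4.0]
y = [0.0, 0.0, 1.0, 3.0]

uniform = [1.0, 1.0, 1.0, 1.0]
# Hypothetical importance weights: the last training point is assumed
# to be rare in the target population, so it is down-weighted.
shifted = [3.0, 3.0, 3.0, 0.1]

thr_unweighted, _ = best_weighted_split(x, y, uniform)
thr_weighted, _ = best_weighted_split(x, y, shifted)
print(thr_unweighted, thr_weighted)
```

On this toy data, down-weighting the last observation moves the chosen threshold from 3.5 to 2.5, which illustrates how the weights can change the tree structure rather than only the leaf predictions.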
Taxonomy
Topics: Bayesian Methods and Mixture Models · Statistical Methods and Inference · Simulation Techniques and Applications
