An ensemble approach to improved prediction from multitype data
Jennifer Clarke, David Seo

TL;DR
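This paper presents an ensemble method that models newly available binary data with LASSO-selected logistic regression and with logic trees, compares those models against a model built on existing data, and uses SVMs to identify where in the covariate space each model predicts well. It is demonstrated on SNP data and clinical risk factors for predicting coronary heart disease.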
Contribution
It introduces a novel ensemble strategy integrating multiple modeling approaches and SVM-based subspace identification for improved outcome prediction.
Findings
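Model results are combined with the aim of more accurate prediction of coronary heart disease.
Models of newly available binary data are integrated with a model of existing binary or non-binary data.
The approach is demonstrated on SNP data together with traditional clinical risk factors.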
Abstract
We have developed a strategy for the analysis of newly available binary data to improve outcome predictions based on existing data (binary or non-binary). Our strategy involves two modeling approaches for the newly available data, one combining binary covariate selection via LASSO with logistic regression and one based on logic trees. The results of these models are then compared to the results of a model based on existing data with the objective of combining model results to achieve the most accurate predictions. The combination of model predictions is aided by the use of support vector machines to identify subspaces of the covariate space in which specific models lead to successful predictions. We demonstrate our approach in the analysis of single nucleotide polymorphism (SNP) data and traditional clinical risk factors for the prediction of coronary heart disease.
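The combination step described in the abstract, using a support vector machine to find regions of the covariate space where a given model predicts successfully, can be illustrated with a small sketch. This is not the authors' implementation. The two base models below are hypothetical stand-ins (they are not LASSO-logistic or logic-tree fits), the toy data are made up, and the linear SVM is trained with a simple Pegasos-style subgradient routine written here for illustration. All function and variable names are assumptions.

```python
import random

def train_linear_svm(X, y, lam=0.01, epochs=200, seed=0):
    """Fit a linear SVM by Pegasos-style stochastic subgradient descent.

    X: list of feature lists; y: labels in {-1, +1}.
    Returns the weights, with the bias stored as the last entry.
    """
    rng = random.Random(seed)
    w = [0.0] * (len(X[0]) + 1)
    t = 0
    for _ in range(epochs):
        order = list(range(len(X)))
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)                    # decaying step size
            xi = X[i] + [1.0]                        # constant term for the bias
            margin = y[i] * sum(a * b for a, b in zip(w, xi))
            w = [(1.0 - eta * lam) * a for a in w]   # shrink (regularization)
            if margin < 1.0:                         # hinge loss is active
                w = [a + eta * y[i] * b for a, b in zip(w, xi)]
    return w

def svm_score(w, x):
    """Signed decision value of the linear SVM at covariate vector x."""
    return sum(a * b for a, b in zip(w, x + [1.0]))

def accuracy(pred, y):
    return sum(p == q for p, q in zip(pred, y)) / len(y)

# Toy covariates (x1, x2) and a binary outcome; all values are made up.
X = [[x1, x2] for x1 in (-2.0, -1.5, -1.0, 1.0, 1.5, 2.0)
              for x2 in (-1.0, 0.0, 1.0)]
truth = [1 if x2 >= 0 else 0 for _, x2 in X]

# Hypothetical base models: A is correct only where x1 > 0, B only where x1 < 0.
pred_a = [yt if x1 > 0 else 1 - yt for (x1, _), yt in zip(X, truth)]
pred_b = [yt if x1 < 0 else 1 - yt for (x1, _), yt in zip(X, truth)]

# Gate training set: keep points where exactly one model is correct and
# label them +1 (trust A) or -1 (trust B).
gate_X, gate_y = [], []
for x, yt, pa, pb in zip(X, truth, pred_a, pred_b):
    if (pa == yt) != (pb == yt):
        gate_X.append(x)
        gate_y.append(1 if pa == yt else -1)

w = train_linear_svm(gate_X, gate_y)

# Combined prediction: use model A where the SVM favors it, otherwise model B.
combined = [pa if svm_score(w, x) >= 0 else pb
            for x, pa, pb in zip(X, pred_a, pred_b)]

acc_a = accuracy(pred_a, truth)
acc_b = accuracy(pred_b, truth)
acc_combined = accuracy(combined, truth)
print(acc_a, acc_b, acc_combined)
```

In this sketch the gate is trained and evaluated on the same points, which overstates how well it would do in practice; a realistic analysis would learn the gate on held-out data and might use a nonlinear kernel if the regions where each model succeeds are not linearly separable.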
