Regression Trees and Random forest based feature selection for malaria risk exposure prediction
Bienvenue Kouwayè

TL;DR
This study uses regression trees and random forests to automatically select environmental and climate variables for predicting exposure to malaria vectors, and reports better prediction accuracy and lower CPU time than the GLM-Lasso method.
Contribution
It introduces a novel variable selection approach using regression trees and random forests with stratified cross-validation for malaria risk prediction.
Findings
Better prediction accuracy than GLM-Lasso.
Lower CPU time than GLM-Lasso.
Effective variable importance assessment.
Abstract
This paper addresses the prediction of the number of Anopheles, the main vector of malaria risk, using environmental and climate variables. Variable selection is based on an automatic machine learning method using regression trees and random forests combined with stratified two-level cross-validation. The minimum threshold of variable importance is assessed using the quadratic distance of the variable importances, and the optimal subset of selected variables is then used to perform predictions. The results proved qualitatively better, in terms of selection, prediction, and CPU time, than those obtained with the GLM-Lasso method.
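The stratified two-level cross-validation mentioned in the abstract can be pictured with a minimal sketch. It is not the paper's implementation: the abstract does not say what the strata are or how many folds are used, so the grouping labels, the fold counts, and the function names `stratified_folds` and `two_level_splits` are all assumptions made for illustration. The outer level holds out a test fold for assessing predictions, and the inner level splits the remaining data into training and validation folds, as one might do when selecting variables.

```python
import random
from collections import defaultdict


def stratified_folds(indices, strata, k, seed=0):
    """Split `indices` into k folds, spreading each stratum evenly across folds.

    `strata[i]` is the (hypothetical) stratum label of observation i.
    """
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for i in indices:
        by_stratum[strata[i]].append(i)

    folds = [[] for _ in range(k)]
    offset = 0  # keeps fold sizes balanced across successive strata
    for label in sorted(by_stratum):
        members = by_stratum[label]
        rng.shuffle(members)
        for j, i in enumerate(members):
            folds[(offset + j) % k].append(i)
        offset += len(members)
    return folds


def two_level_splits(strata, k_outer=3, k_inner=2, seed=0):
    """Yield (train, valid, test) index lists for a two-level stratified CV.

    Outer level: each fold serves once as the held-out test set.
    Inner level: the remaining observations are re-split into
    training and validation folds (assumed here to drive variable selection).
    """
    all_idx = list(range(len(strata)))
    outer = stratified_folds(all_idx, strata, k_outer, seed)
    for o in range(k_outer):
        test = outer[o]
        rest = [i for p in range(k_outer) if p != o for i in outer[p]]
        inner = stratified_folds(rest, strata, k_inner, seed + o + 1)
        for v in range(k_inner):
            valid = inner[v]
            train = [i for q in range(k_inner) if q != v for i in inner[q]]
            yield train, valid, test


# Toy data: 12 observations in two assumed strata (labels are illustrative only).
strata = ["A"] * 6 + ["B"] * 6
splits = list(two_level_splits(strata, k_outer=3, k_inner=2))
```

Each element of `splits` is one inner training/validation split paired with its outer test fold. In a real pipeline a regression tree or random forest would be fitted on `train`, variable importances compared on `valid`, and the retained subset evaluated on `test`.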
Taxonomy
Topics: Digital Imaging for Blood Diseases · Face and Expression Recognition · Machine Learning and Data Classification
