Tree-Based Predictive Models for Noisy Input Data
Kevin McCoy, Zachary Wooten, Christine B. Peterson

TL;DR
This paper introduces meBART, an extension of Bayesian additive regression trees (BART) that directly incorporates measurement error in predictors, improving predictive accuracy and uncertainty quantification when inputs are noisy.
Contribution
The paper presents meBART, a novel method that extends BART to handle measurement error in predictors, enhancing predictive accuracy and uncertainty estimation in complex models.
Findings
meBART outperforms existing models in noisy data simulations.
The method provides more reliable uncertainty quantification.
Applications demonstrate improved predictions in biomedical data with measurement error.
Abstract
Measurement error is prevalent across all domains of scientific research where only imprecise observations, rather than the true underlying values, can be obtained. For example, estimates of human microbiome diversity are based on small samples from a much larger, generally unobserved system and reflect both sampling error and technical variation. In high-noise settings like these, it becomes difficult to make accurate predictions and to summarize uncertainty. Methods have previously been proposed to accommodate measurement error in classic predictive models, such as linear regression. However, relatively little work has been done to address measurement error in more complex and flexible models. Bayesian additive regression trees (BART), a Bayesian nonparametric model that sums the output of many decision trees, offers robust predictions with built-in uncertainty quantification. In this…
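To make the abstract's motivation concrete, the following is a minimal sketch of why noisy predictors are a problem, using the classic linear regression case that the abstract mentions. It is not meBART and not the authors' code. The function name, sample size, and noise levels are illustrative assumptions. The sketch assumes classical additive measurement error (observed w = x + u, with u independent of x), under which the fitted slope shrinks toward zero by the factor var(x) / (var(x) + var(u)).

```python
import numpy as np


def attenuation_demo(n=20000, beta=2.0, sigma_x=1.0, sigma_u=1.0, seed=0):
    """Illustrative only: simulate classical measurement error in one predictor.

    All parameter values are assumptions chosen for the demonstration.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, sigma_x, n)        # true predictor (unobserved in practice)
    w = x + rng.normal(0.0, sigma_u, n)    # noisy observation of x
    y = beta * x + rng.normal(0.0, 0.5, n) # outcome depends on the true x

    # Least-squares slopes; np.polyfit returns [slope, intercept] for degree 1.
    slope_true = np.polyfit(x, y, 1)[0]
    slope_noisy = np.polyfit(w, y, 1)[0]

    # Theoretical slope when regressing on w instead of x.
    reliability = sigma_x**2 / (sigma_x**2 + sigma_u**2)
    return slope_true, slope_noisy, beta * reliability


slope_true, slope_noisy, expected_noisy = attenuation_demo()
print(f"slope using true x:  {slope_true:.3f}")
print(f"slope using noisy w: {slope_noisy:.3f}")
print(f"theoretical value:   {expected_noisy:.3f}")
```

With equal signal and noise variance, the slope estimated from the noisy predictor should land near half the true coefficient. Flexible models such as tree ensembles have no comparable closed-form correction, which is the gap the abstract describes.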
Taxonomy
Topics: Gene expression and cancer classification · Bayesian Modeling and Causal Inference · Statistical Methods and Inference
