Semiparametric Mixed-Scale Models Using Shared Bayesian Forests
Antonio R. Linero, Debajyoti Sinha, and Stuart R. Lipsitz

TL;DR
This paper introduces a Bayesian nonparametric approach using shared forests for multivariate and semi-continuous regression, improving inference by sharing information across model components, especially in high-dimensional sparse settings.
Contribution
It develops novel shared Bayesian forest models for multivariate and semi-continuous responses, enabling nonparametric information sharing across components, and applies these to complex medical expenditure data.
Findings
Sharing information across related model components is often beneficial, particularly in sparse high-dimensional problems that require variable selection.
The proposed models effectively analyze heteroscedastic and semi-continuous data.
Application to MEPS data demonstrates practical utility and interpretability.
Abstract
This paper demonstrates the advantages of sharing information about unknown features of covariates across multiple model components in various nonparametric regression problems, including those with multivariate, heteroscedastic, and semi-continuous responses. We present methodology that allows information to be shared nonparametrically across model components using Bayesian sum-of-tree models. Our simulation results demonstrate that sharing information across related model components is often very beneficial, particularly in sparse high-dimensional problems in which variable selection must be conducted. We illustrate our methodology by analyzing medical expenditure data from the Medical Expenditure Panel Survey (MEPS). To facilitate the Bayesian nonparametric regression analysis, we develop two novel models for analyzing the MEPS data using Bayesian additive…
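As a rough illustration of what sharing a sum-of-trees across model components can look like, the sketch below uses trees whose splitting rule is common to all components while each leaf stores one value per component. This is a minimal sketch under stated assumptions, not the authors' implementation: the names `SharedStump` and `shared_forest_predict`, the restriction to depth-one trees, the two-component reading (one component for whether the response is positive, one for its size), and all numeric values are hypothetical, and no Bayesian prior or posterior sampling is shown.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class SharedStump:
    """Hypothetical depth-one tree whose split is shared by all components."""
    feature: int              # index of the covariate used for the split
    threshold: float          # rows with x[feature] <= threshold go left
    left: Sequence[float]     # left-leaf values, one per model component
    right: Sequence[float]    # right-leaf values, one per model component

    def predict(self, x: Sequence[float]) -> Sequence[float]:
        # One routing decision selects the leaf values for every component.
        return self.left if x[self.feature] <= self.threshold else self.right


def shared_forest_predict(stumps: List[SharedStump],
                          x: Sequence[float]) -> List[float]:
    """Sum the per-tree leaf vectors, giving one function value per component."""
    totals = [0.0] * len(stumps[0].left)
    for stump in stumps:
        for k, value in enumerate(stump.predict(x)):
            totals[k] += value
    return totals


# Two hypothetical trees, two components (assumed: index 0 drives whether the
# response is positive, index 1 drives the size of a positive response).
stumps = [
    SharedStump(feature=0, threshold=0.5, left=[1.0, -1.0], right=[-1.0, 2.0]),
    SharedStump(feature=1, threshold=3.0, left=[0.5, 0.5], right=[0.0, 1.0]),
]

f_a = shared_forest_predict(stumps, [0.2, 5.0])
f_b = shared_forest_predict(stumps, [0.8, 1.0])
print(f_a, f_b)
```

Because both components are routed through the same splits, a covariate that matters for one component is automatically available to the other, which is one way to picture why sharing could help when only a few of many covariates are relevant.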
