metboost: Exploratory regression analysis with hierarchically clustered data
Patrick J. Miller, Daniel B. McArtor, Gitta H. Lubke

TL;DR
Metboost extends boosted decision trees to hierarchically clustered data. It models nonlinear and group-specific effects, improving prediction and variable selection in large, complex datasets.
Contribution
The paper introduces metboost, a novel method that constrains tree structures across groups while allowing group-specific terminal node means, enhancing exploratory regression analysis for hierarchical data.
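The core idea (one tree structure shared by all groups, with terminal node means that differ by group) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it fits a single depth-one tree rather than a boosted ensemble, it uses plain per-group averages as the terminal node means (the paper's estimator may differ), and the names `fit_shared_stump` and `predict` are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def fit_shared_stump(x, y, group):
    """Fit one split on pooled data, then estimate leaf means per group.

    Assumes x contains at least two distinct values.
    """
    # Step 1: shared structure. Choose the threshold that minimizes
    # squared error on the pooled data, ignoring group membership.
    best_sse, best_thr = float("inf"), None
    values = sorted(set(x))
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2
        left = [yi for xi, yi in zip(x, y) if xi <= thr]
        right = [yi for xi, yi in zip(x, y) if xi > thr]
        sse = sum((v - mean(left)) ** 2 for v in left)
        sse += sum((v - mean(right)) ** 2 for v in right)
        if sse < best_sse:
            best_sse, best_thr = sse, thr

    # Step 2: group-specific terminal node means. The split stays fixed;
    # only the value predicted in each leaf differs by group.
    cells = defaultdict(list)   # (group, leaf) -> responses
    pooled = defaultdict(list)  # leaf -> responses, used for unseen groups
    for xi, yi, gi in zip(x, y, group):
        leaf = "left" if xi <= best_thr else "right"
        cells[(gi, leaf)].append(yi)
        pooled[leaf].append(yi)
    group_means = {key: mean(vals) for key, vals in cells.items()}
    pooled_means = {leaf: mean(vals) for leaf, vals in pooled.items()}
    return best_thr, group_means, pooled_means


def predict(model, xi, gi):
    """Predict with the group's leaf mean, falling back to the pooled mean."""
    thr, group_means, pooled_means = model
    leaf = "left" if xi <= thr else "right"
    return group_means.get((gi, leaf), pooled_means[leaf])


# Toy data: both groups change level at the same point in x,
# but the size of the change differs between groups.
x = [1, 2, 3, 4, 1, 2, 3, 4]
y = [1.0, 1.0, 5.0, 5.0, 2.0, 2.0, 8.0, 8.0]
group = ["a"] * 4 + ["b"] * 4
model = fit_shared_stump(x, y, group)
```

On the toy data both groups share the same split point, while `predict` returns a different value for group `"a"` and group `"b"` on the same side of the split. A group that was not seen during fitting falls back to the pooled leaf mean, which is one simple choice among several.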
Findings
Metboost improves prediction accuracy by 15% over boosted decision trees.
Metboost enhances variable selection performance by up to 70%.
The method remains computationally feasible for large datasets.
Abstract
As data collections become larger, exploratory regression analysis becomes more important but more challenging. When observations are hierarchically clustered the problem is even more challenging because model selection with mixed effect models can produce misleading results when nonlinear effects are not included in the model (Bauer and Cai, 2009). A machine learning method called boosted decision trees (Friedman, 2001) is a good approach for exploratory regression analysis in real data sets because it can detect predictors with nonlinear and interaction effects while also accounting for missing data. We propose an extension to boosted decision trees called metboost for hierarchically clustered data. It works by constraining the structure of each tree to be the same across groups, but allowing the terminal node means to differ. This allows predictors and split points to lead…
Taxonomy
Topics: Advanced Clustering Algorithms Research · Face and Expression Recognition · Advanced Statistical Methods and Models
