Group Probability-Weighted Tree Sums for Interpretable Modeling of Heterogeneous Data
Keyan Nasseri, Chandan Singh, James Duncan, Aaron Kornblith, Bin Yu

TL;DR
This paper introduces G-FIGS, a novel instance-weighted tree-sum method that improves predictive accuracy and interpretability for heterogeneous data, especially in healthcare, by leveraging group membership probabilities.
Contribution
The paper presents G-FIGS, a new method combining group probability estimation with decision tree sums to enhance modeling of diverse data while maintaining interpretability.
Findings
G-FIGS outperforms CART and FIGS on clinical datasets.
G-FIGS increases specificity by up to 10% at fixed sensitivity.
Final models remain interpretable with fewer than 16 rules.
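To make the "specificity at fixed sensitivity" comparison concrete, here is a minimal sketch of how such a metric can be computed from predicted scores. The function name `specificity_at_sensitivity` and the toy labels and scores are hypothetical illustrations, not taken from the paper or its code.

```python
def specificity_at_sensitivity(y_true, scores, min_sensitivity):
    """Best specificity over all thresholds whose sensitivity is >= min_sensitivity."""
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = len(y_true) - n_pos
    best = 0.0
    for t in sorted(set(scores)):
        # Predict positive when the score reaches the threshold.
        tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= t)
        tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < t)
        if tp / n_pos >= min_sensitivity:
            best = max(best, tn / n_neg)
    return best


# Toy data (hypothetical): 4 positives, 6 negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.5, 0.3, 0.2, 0.1, 0.05]

spec_full = specificity_at_sensitivity(y_true, scores, 1.0)
spec_relaxed = specificity_at_sensitivity(y_true, scores, 0.75)
```

Comparing two models with this kind of metric means fixing the same sensitivity floor for both and reporting the difference in specificity.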
Abstract
Machine learning in high-stakes domains, such as healthcare, faces two critical challenges: (1) generalizing to diverse data distributions given limited training data while (2) maintaining interpretability. To address these challenges, we propose an instance-weighted tree-sum method that effectively pools data across diverse groups to output a concise, rule-based model. Given distinct groups of instances in a dataset (e.g., medical patients grouped by age or treatment site), our method first estimates group membership probabilities for each instance. Then, it uses these estimates as instance weights in FIGS (Tan et al. 2022), to grow a set of decision trees whose values sum to the final prediction. We call this new method Group Probability-Weighted Tree Sums (G-FIGS). G-FIGS achieves state-of-the-art prediction performance on important clinical datasets; e.g., holding the level of…
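The two steps described in the abstract (estimate group membership probabilities, then use them as instance weights when fitting trees) can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: a hand-rolled logistic regression stands in for the membership model, a single weighted decision stump stands in for FIGS, fitting one weighted model per group is an assumption about how the weights are applied, and the toy data and all function names are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_membership(X, g, lr=1.0, epochs=500):
    """Stand-in membership model: logistic regression for P(group = 1 | x)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, gi in zip(X, g):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) - gi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * x[j]
        b -= lr * grad_b / n
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
    return lambda x: sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


def weighted_sse(pairs):
    # pairs: list of (label, weight); returns (weighted squared error, weighted mean).
    total = sum(wt for _, wt in pairs)
    mean = sum(yi * wt for yi, wt in pairs) / total
    return sum(wt * (yi - mean) ** 2 for yi, wt in pairs), mean


def fit_weighted_stump(X, y, weights):
    """Stand-in for FIGS (assumption): one depth-1 tree fit with instance weights."""
    best = None
    for j in range(len(X[0])):
        values = sorted({x[j] for x in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [(yi, wi) for x, yi, wi in zip(X, y, weights) if x[j] <= t]
            right = [(yi, wi) for x, yi, wi in zip(X, y, weights) if x[j] > t]
            if not left or not right:
                continue
            (l_err, l_mean), (r_err, r_mean) = weighted_sse(left), weighted_sse(right)
            if best is None or l_err + r_err < best[0]:
                best = (l_err + r_err, j, t, l_mean, r_mean)
    _, j, t, l_mean, r_mean = best
    return lambda x: l_mean if x[j] <= t else r_mean


# Hypothetical toy data: features are (age, marker), both scaled to [0, 1].
rng = random.Random(0)
X, group, y = [], [], []
for _ in range(200):
    age, marker = rng.random(), rng.random()
    young = 1 if age < 0.5 else 0
    X.append([age, marker])
    group.append(young)
    # Hypothetical outcome rule whose marker cutoff differs by group.
    y.append(1 if marker > (0.3 if young else 0.7) else 0)

# Step 1: estimate group membership probabilities for every instance.
p_young = fit_membership(X, group)
w_young = [p_young(x) for x in X]
w_old = [1.0 - p for p in w_young]

# Step 2: fit one model per group on ALL data, weighted by membership probability.
model_young = fit_weighted_stump(X, y, w_young)
model_old = fit_weighted_stump(X, y, w_old)
```

Because every instance contributes to every group's model in proportion to its membership probability, instances near a group boundary are shared between groups instead of being assigned to exactly one.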
Taxonomy
Topics: Machine Learning in Healthcare · Topic Modeling · Medical Imaging and Analysis
