JigSaw: A tool for discovering explanatory high-order interactions from random forests
Demetrius DiMucci

TL;DR
JigSaw is a novel algorithm that leverages random forest structures to discover high-order feature interactions, aiding biological data interpretation and hypothesis generation.
Contribution
The paper introduces JigSaw, a new method for identifying explanatory high-order interactions from random forests, validated through simulations and real-world biological datasets.
Findings
Successfully recovered ground truth patterns in simulations.
Identified key interactions explaining heart disease and breast cancer.
Achieved high precision and coverage in real-world datasets.
Abstract
Machine learning is revolutionizing biology by facilitating the prediction of outcomes from complex patterns found in massive data sets. Large biological data sets, like those generated by transcriptome or microbiome studies, measure many relevant components that interact in vivo with one another in modular ways. Identifying the high-order interactions that machine learning models use to make predictions would facilitate the development of hypotheses linking combinations of measured components to outcome. By using the structure of random forests, a new algorithmic approach, termed JigSaw, was developed to aid in the discovery of patterns that could explain predictions made by the forest. By examining the patterns of individual decision trees, JigSaw identifies high-order interactions between measured features that are strongly associated with a particular outcome and identifies the relevant…
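The abstract is truncated here and does not spell out the JigSaw procedure, so the following is only a rough illustration of the general idea it describes: reading feature combinations off the root-to-leaf paths of individual trees and tallying which combinations recur across a forest for a given outcome. It is not the author's algorithm. The nested-dict tree representation, the function names (`paths_to_label`, `count_feature_sets`), and the feature names (`geneA`, `geneB`, `geneC`) are all hypothetical and chosen for illustration.

```python
from collections import Counter

# Hypothetical toy representation (an assumption, not from the paper):
# a leaf is {"label": outcome}; a split is
# {"feature": name, "threshold": t, "left": node, "right": node}.

def paths_to_label(node, label, conditions=()):
    """Yield the conditions along every root-to-leaf path that ends in `label`."""
    if "label" in node:
        if node["label"] == label:
            yield conditions
        return
    f, t = node["feature"], node["threshold"]
    yield from paths_to_label(node["left"], label, conditions + ((f, "<=", t),))
    yield from paths_to_label(node["right"], label, conditions + ((f, ">", t),))

def count_feature_sets(forest, label, min_order=2):
    """Count how often each set of features co-occurs on a path predicting `label`."""
    counts = Counter()
    for tree in forest:
        for path in paths_to_label(tree, label):
            features = frozenset(f for f, _, _ in path)
            if len(features) >= min_order:  # keep only interactions, not single features
                counts[features] += 1
    return counts

def leaf(label):
    return {"label": label}

def split(feature, threshold, left, right):
    return {"feature": feature, "threshold": threshold, "left": left, "right": right}

# Three hand-built trees; two of them combine geneA and geneB to predict outcome 1.
tree_a = split("geneA", 0.5, leaf(0), split("geneB", 0.5, leaf(0), leaf(1)))
tree_b = split("geneB", 0.5, leaf(0), split("geneA", 0.5, leaf(0), leaf(1)))
tree_c = split("geneC", 0.5, leaf(0), leaf(1))

counts = count_feature_sets([tree_a, tree_b, tree_c], label=1)
for features, n in counts.most_common():
    print(sorted(features), n)
```

In this toy forest the pair `geneA`/`geneB` is the only combination of two or more features that leads to outcome 1, and it appears in two trees. A real analysis would also need to account for thresholds, sample support, and statistical association, which this sketch ignores.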
Taxonomy
Topics: Gene expression and cancer classification · Neural Networks and Applications · Metabolomics and Mass Spectrometry Studies
Methods: JigSaw
