Graph-guided random forest for gene set selection
Bastian Pfeifer, Hubert Baniecki, Anna Saranti, Przemyslaw Biecek, and Andreas Holzinger

TL;DR
This paper introduces a graph-guided random forest method that leverages domain-specific network knowledge for gene set selection, enhancing interpretability and effectiveness in bioinformatics and systems biology applications.
Contribution
The paper presents a novel Greedy Decision Forest algorithm that incorporates network information for subnetwork detection, improving interpretability and applicability across various research domains.
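To make the idea of network-guided feature selection concrete, here is a minimal sketch of one way a forest could be tied to a graph: each tree is restricted to a connected set of nodes sampled by a short random walk, so any tree that performs well points to a candidate subnetwork. This summary does not spell out the paper's exact procedure, so the sampling scheme, the function names `random_walk_features` and `is_connected`, the toy `adjacency` network, and the walk length are all assumptions made for illustration.

```python
import random


def random_walk_features(adjacency, walk_length, rng):
    """Sample a connected node set via a random walk (illustrative assumption)."""
    node = rng.choice(sorted(adjacency))  # random start node
    visited = [node]
    for _ in range(walk_length - 1):
        neighbors = sorted(adjacency[node])
        if not neighbors:
            break  # isolated node: the walk cannot continue
        node = rng.choice(neighbors)
        if node not in visited:
            visited.append(node)  # keep each node once, in visiting order
    return visited


def is_connected(nodes, adjacency):
    """Check that `nodes` induce a connected subgraph (depth-first search)."""
    nodes = set(nodes)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        current = stack.pop()
        for neighbor in adjacency[current]:
            if neighbor in nodes and neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return seen == nodes


# Hypothetical toy gene network (undirected, stored as adjacency sets).
adjacency = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}

rng = random.Random(0)
# One candidate feature set per tree; each would restrict the genes that tree may split on.
feature_sets = [random_walk_features(adjacency, 3, rng) for _ in range(10)]
for feature_set in feature_sets:
    print(feature_set, is_connected(feature_set, adjacency))
```

In a full pipeline, each sampled set would feed one decision tree, and a greedy step could then keep the best-performing trees and resample around them. That selection loop is omitted here because its details are not given in this summary.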
Findings
Effective detection of disease-related network modules from multi-omics data.
Enhanced interpretability of machine learning models in biological contexts.
Applicable to diverse domains beyond biomedicine.
Abstract
Machine learning methods can detect complex relationships between variables, but usually do not exploit domain knowledge. This is a limitation because in many scientific disciplines, such as systems biology, domain knowledge is available in the form of graphs or networks, and its use can improve model performance. We need network-based algorithms that are versatile and applicable in many research areas. In this work, we demonstrate subnetwork detection based on multi-modal node features using a novel Greedy Decision Forest with inherent interpretability. The latter will be a crucial factor to retain experts and gain their trust in such algorithms. To demonstrate a concrete application example, we focus on bioinformatics, systems biology and particularly biomedicine, but the presented methodology is applicable in many other domains as well. Systems biology is a good example of a field in…
Taxonomy
Topics: Bioinformatics and Genomic Networks · Gene expression and cancer classification · Gene Regulatory Network Analysis
