Interpreting Microbiome Relative Abundance Data Using Symbolic Regression
Swagatam Haldar, Christoph Stein-Thoeringer, Vadim Borisov

TL;DR
This study applies symbolic regression to microbiome data related to colorectal cancer, demonstrating that it offers comparable predictive performance to traditional models while providing superior interpretability for biological insights.
Contribution
The paper introduces the use of symbolic regression for microbiome data analysis, highlighting its interpretability and ability to elucidate biological relationships in colorectal cancer.
Findings
Symbolic regression (SR) is competitive with traditional models in predictive performance (F1 score and accuracy)
SR provides explicit mathematical expressions for biological insights
SR helps interpret complex models like XGBoost
Abstract
Understanding the complex interactions within the microbiome is crucial for developing effective diagnostic and therapeutic strategies. Traditional machine learning models often lack interpretability, which is essential for clinical and biological insights. This paper explores the application of symbolic regression (SR) to microbiome relative abundance data, with a focus on colorectal cancer (CRC). SR, known for its high interpretability, is compared against traditional machine learning models such as random forests and gradient-boosted decision trees. All models are evaluated on performance metrics such as F1 score and accuracy. We use data from 71 studies spanning various cohorts, comprising over 10,000 samples and 749 species features. Our results indicate that SR not only competes reasonably well in terms of predictive performance, but also excels in model interpretability. SR…
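To make the idea of an explicit, readable classifier concrete, the following is a minimal sketch and not the authors' pipeline. It assumes synthetic count data for four hypothetical species, converts counts to relative abundances, and exhaustively searches a tiny expression grammar (single features and pairwise +, -, *, /) for the thresholded formula with the best F1 score. Real symbolic regression tools typically search a far larger expression space, often with genetic programming; the exhaustive search, the function names (`to_relative_abundance`, `fit_toy_sr`), and the labeling rule are all illustrative assumptions.

```python
import itertools
import random


def to_relative_abundance(counts):
    """Scale one sample's raw counts so that they sum to 1."""
    total = sum(counts)
    return [c / total for c in counts] if total > 0 else [0.0] * len(counts)


def f1_and_accuracy(y_true, y_pred):
    """Return (F1 score, accuracy) for binary labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    correct = sum(1 for t, p in pairs if t == p)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return f1, correct / len(pairs)


# Hypothetical toy grammar: binary operators applied to pairs of features.
BINARY_OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / (b + 1e-9),  # small offset avoids division by zero
}


def candidate_expressions(n_features):
    """Yield (text, function) pairs for every expression in the toy grammar."""
    for i in range(n_features):
        yield f"x{i}", (lambda row, i=i: row[i])
    for i, j in itertools.permutations(range(n_features), 2):
        for name, op in BINARY_OPS.items():
            yield (f"(x{i} {name} x{j})",
                   lambda row, i=i, j=j, op=op: op(row[i], row[j]))


def fit_toy_sr(X, y):
    """Exhaustively pick the rule 'expression > threshold' with the best F1."""
    best_f1, best_text, best_threshold = -1.0, None, None
    for text, fn in candidate_expressions(len(X[0])):
        values = [fn(row) for row in X]
        for threshold in sorted(set(values)):
            pred = [1 if v > threshold else 0 for v in values]
            f1, _ = f1_and_accuracy(y, pred)
            if f1 > best_f1:
                best_f1, best_text, best_threshold = f1, text, threshold
    return best_f1, best_text, best_threshold


# Synthetic data (an assumption, not from the paper): 60 samples, 4 species.
rng = random.Random(0)
X = [to_relative_abundance([rng.randint(1, 100) for _ in range(4)])
     for _ in range(60)]
# Hypothetical ground truth: a sample is positive when species 0 outweighs species 2.
y = [1 if row[0] > row[2] else 0 for row in X]

best_f1, best_text, best_threshold = fit_toy_sr(X, y)
print(f"predict positive if {best_text} > {best_threshold:.4f} (F1 = {best_f1:.2f})")
```

The output is a single formula that can be read and checked directly, which is the kind of interpretability the abstract attributes to SR. Because the labels here are generated from a rule that the grammar can express, the search reaches a perfect F1 on this toy data; on real cohort data no such guarantee holds, and the rule should be evaluated on held-out samples.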
Taxonomy
Topics: Bayesian Methods and Mixture Models
Methods: Focus
